
What is the difference between 'transform' and 'fit_transform' in sklearn?

codestyles 2020. 10. 17. 10:33



In the sklearn Python toolbox, sklearn.decomposition.RandomizedPCA has two methods, transform and fit_transform. The descriptions of the two methods are as follows.

[Image: documentation description of transform] [Image: documentation description of fit_transform]

But what is the difference between them?


The difference is that you can use pca.transform only after the PCA has already been computed (fitted) on a matrix.

    In [12]: pc2 = RandomizedPCA(n_components=3)

    In [13]: pc2.transform(X) # can't transform because it does not know how to do it.
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-13-e3b6b8ea2aff> in <module>()
    ----> 1 pc2.transform(X)

    /usr/local/lib/python3.4/dist-packages/sklearn/decomposition/pca.py in transform(self, X, y)
        714         # XXX remove scipy.sparse support here in 0.16
        715         X = atleast2d_or_csr(X)
    --> 716         if self.mean_ is not None:
        717             X = X - self.mean_
        718 

    AttributeError: 'RandomizedPCA' object has no attribute 'mean_'

    In [14]: pc2.fit_transform(X)
    Out[14]: 
    array([[-1.38340578, -0.2935787 ],
           [-2.22189802,  0.25133484],
           [-3.6053038 , -0.04224385],
           [ 1.38340578,  0.2935787 ],
           [ 2.22189802, -0.25133484],
           [ 3.6053038 ,  0.04224385]])
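
The session above comes from an old scikit-learn release. As a rough sketch of the same mistake on a recent release: RandomizedPCA has since been removed in favor of PCA, and calling transform on an unfitted estimator raises NotFittedError (a subclass of AttributeError) rather than a bare AttributeError. The data X below is made up for illustration and is not the X from the session.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.exceptions import NotFittedError

# Illustrative data (an assumption, not the X used in the session above)
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])

pca = PCA(n_components=2)

try:
    pca.transform(X)      # not fitted yet, so there is no rule to apply
    raised = False
except NotFittedError:
    raised = True

# fit_transform learns the components from X and then projects X
projected = pca.fit_transform(X)
print(raised, projected.shape)
```

Note that n_components is set to 2 here because recent releases reject a value larger than the number of features.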

To use .transform, you first have to teach the pca object the transformation rule:

    In [20]: pca = RandomizedPCA(n_components=3)

    In [21]: pca.fit(X)
    Out[21]:
    RandomizedPCA(copy=True, iterated_power=3, n_components=3, random_state=None,
           whiten=False)

    In [22]: pca.transform(z)
    Out[22]:
    array([[ 2.76681156,  0.58715739],
           [ 1.92831932,  1.13207093],
           [ 0.54491354,  0.83849224],
           [ 5.53362311,  1.17431479],
           [ 6.37211535,  0.62940125],
           [ 7.75552113,  0.92297994]])

In particular, the PCA transform applies the change of basis obtained from the PCA decomposition of matrix X to matrix Z.
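
A minimal sketch of that change of basis, assuming a recent scikit-learn release where PCA replaces RandomizedPCA. The matrices X and Z are made up for illustration. With the default whiten=False, transform should amount to centering Z with the mean learned from X and projecting onto the components learned from X:

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative matrices (assumptions, not the X and z from the session above)
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])
Z = np.array([[0.5, 1.5], [4.0, 3.0]])

pca = PCA(n_components=2)
pca.fit(X)                       # learn mean_ and components_ from X only

Z_projected = pca.transform(Z)   # apply X's change of basis to Z

# The same projection written out by hand: center with X's mean,
# then express the result in the basis of X's principal components.
Z_manual = (Z - pca.mean_) @ pca.components_.T

print(np.allclose(Z_projected, Z_manual))
```

Nothing about Z influences the learned basis; Z is only expressed in it.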


From the scikit-learn estimator API:

fit(): used to learn the model parameters from the training data.

transform(): applies the parameters learned by fit() to the data to produce a transformed data set.

fit_transform(): a combination of fit() and transform() applied to the same data set.


Check out Chapter 4 of this book and the answer on Stack Exchange for more clarity.


In feature scaling, these methods are used to center and scale the given data. This puts every feature on a comparable scale.

For this, we use the Z-score method:

z = (x - μ) / σ

We compute the parameters on the training set.

1. fit(): calculates the parameters μ and σ and stores them on the object.

2. transform(): applies the transformation to a given data set using these stored parameters.

3. fit_transform(): combines fit() and transform() to fit and transform a data set in one call.
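
The three steps can be sketched in plain Python. ZScoreScaler is a hypothetical one-feature class written only for illustration; it is not part of scikit-learn. It assumes the population standard deviation, which is also what StandardScaler uses.

```python
from statistics import fmean, pstdev


class ZScoreScaler:
    """Hypothetical one-feature scaler that mimics the fit/transform pattern."""

    def fit(self, values):
        # Learn the parameters and keep them on the object
        self.mean_ = fmean(values)
        self.std_ = pstdev(values)   # population standard deviation
        return self

    def transform(self, values):
        # Apply the stored parameters; nothing is re-learned here
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        # Same as fit() followed by transform() on the same data
        return self.fit(values).transform(values)


train = [2.0, 4.0, 6.0]
test = [4.0, 8.0]

scaler = ZScoreScaler()
train_scaled = scaler.fit_transform(train)
test_scaled = scaler.transform(test)   # uses the mean and std of `train`
print(train_scaled, test_scaled)
```

The value 4.0 in the test data maps to 0 because 4.0 is the training mean, not the test mean.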

Code snippet for feature scaling/standardisation (after train_test_split):

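from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
# Learn μ and σ from the training set and scale it
X_train = sc.fit_transform(X_train)
# Reuse the training μ and σ to scale the test set
X_test = sc.transform(X_test)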

We apply the same parameters (the μ and σ computed from the training set) when transforming the test set.
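
A small check of that claim, using made-up arrays in place of a real train_test_split result. The scaled test set should match what you get by hand from the training mean and standard deviation:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data standing in for the output of train_test_split
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test = np.array([[2.5, 25.0], [10.0, 0.0]])

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)   # learns mean_ and scale_ from X_train
X_test_scaled = sc.transform(X_test)         # reuses them, learns nothing new

# The test set is scaled with the training statistics, not its own
expected = (X_test - X_train.mean(axis=0)) / X_train.std(axis=0)
print(np.allclose(X_test_scaled, expected))
```

Calling fit_transform on the test set instead would compute a separate μ and σ from the test data, so the two sets would no longer be on the same scale.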


Generic difference between the methods:

  • fit(raw_documents[, y]): Learn a vocabulary dictionary of all tokens in the raw documents.
  • fit_transform(raw_documents[, y]): Learn the vocabulary dictionary and return the document-term matrix. This is equivalent to fit followed by transform, but more efficiently implemented.
  • transform(raw_documents): Transform documents to document-term matrix. Extract token counts out of raw text documents using the vocabulary fitted with fit or the one provided to the constructor.

Both fit_transform and transform return the same kind of object: a document-term matrix.

Source
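
The signatures quoted above match scikit-learn's text vectorizers. As a sketch, assuming CountVectorizer with default settings and a few made-up documents:

```python
from sklearn.feature_extraction.text import CountVectorizer

train_docs = ["the cat sat", "the dog sat down"]
new_docs = ["the cat barked"]      # "barked" never appears in train_docs

vec = CountVectorizer()
train_counts = vec.fit_transform(train_docs)   # learns the vocabulary, returns counts
new_counts = vec.transform(new_docs)           # counts against the learned vocabulary

vocab = sorted(vec.vocabulary_)                # learned tokens in column order
print(vocab)
print(train_counts.toarray())
print(new_counts.toarray())
```

Because transform only uses the fitted vocabulary, the unseen word "barked" gets no column and is dropped.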


Here is the basic difference between .fit() and .fit_transform():

.fit() is typically called with two arguments (x, y) in supervised learning, where we know what we are going to predict, while
.fit_transform() is typically called with one argument (x) on transformers, where there is no target to predict. These are typical uses rather than a strict rule: unsupervised estimators also have fit(x), and fit_transform accepts an optional y.

Reference URL: https://stackoverflow.com/questions/23838056/what-is-the-difference-between-transform-and-fit-transform-in-sklearn
