[파머완 Ch. 2] 3. Getting Familiar with scikit-learn's Core Framework


hyeonzzz's Tech Blog

Machine Learning

hyeonzzz 2024. 1. 4. 20:46

2. Getting Started with Machine Learning Using scikit-learn - Getting Familiar with scikit-learn's Core Framework

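1) The Estimator class

scikit-learn's classes share a simple interface: fit( ) trains the model and predict( ) returns the predictions
Classifier : a class that implements a classification algorithm
Regressor : a class that implements a regression algorithm
Estimator : Classifier + Regressor (a collective term for the classes that implement supervised learning algorithms)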
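
As a minimal sketch of the shared fit( ) / predict( ) interface, the example below trains one classifier and one regressor in the same way. The choice of DecisionTreeClassifier and LinearRegression, and the small synthetic regression data (y = 2x + 1), are assumptions made for illustration and do not come from the original post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Classifier: learn class labels from the iris features.
iris = load_iris()
clf = DecisionTreeClassifier(random_state=0)
clf.fit(iris.data, iris.target)
class_pred = clf.predict(iris.data[:5])  # array of integer labels

# Regressor: learn a continuous target (synthetic data, y = 2x + 1).
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0
reg = LinearRegression()
reg.fit(X, y)
reg_pred = reg.predict(np.array([[10.0]]))  # array holding one float

print(class_pred)
print(reg_pred)
```

Both objects are used identically even though one predicts labels and the other predicts numbers, which is the point of grouping them under the Estimator name.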

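2) Format of scikit-learn's built-in datasets

They are generally dictionary-like objects. The keys usually include data, target, target_names, feature_names, and DESCR

data : the feature dataset
target : the label values for classification, or the numeric target values for regression
target_names : the names of the individual labels
feature_names : the names of the features
DESCR : a description of the dataset and of each feature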
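
To illustrate how target differs between classification and regression, here is a short sketch that compares two built-in datasets. Using load_diabetes as the regression example is an assumption for illustration; the original post only loads the iris data. The exact set of keys can vary between datasets and scikit-learn versions, so the last line checks for a key rather than assuming it exists.

```python
import numpy as np
from sklearn.datasets import load_diabetes, load_iris

iris = load_iris()          # classification dataset
diabetes = load_diabetes()  # regression dataset

# Classification: target holds integer class labels.
print(iris.target.dtype, np.unique(iris.target))

# Regression: target holds continuous numeric values.
print(diabetes.target.dtype, diabetes.target[:3])

# Not every dataset exposes every key; check before relying on one.
print('target_names' in iris, 'target_names' in diabetes)
```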

Loading the iris dataset

from sklearn.datasets import load_iris

iris_data = load_iris()
print(type(iris_data))

<class 'sklearn.utils._bunch.Bunch'>

The Bunch class is similar to Python's dictionary type, and most of the built-in datasets are returned in this dictionary-like form
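
A small sketch of what "similar to a dictionary" means in practice: a Bunch accepts both dictionary-style keys and attribute-style access. The variable name same_object is introduced here only for illustration.

```python
from sklearn.datasets import load_iris

iris_data = load_iris()

# Dictionary-style and attribute-style access return the same object.
same_object = iris_data['data'] is iris_data.data
print(same_object)

# Because Bunch subclasses dict, the usual dict methods work too.
print(isinstance(iris_data, dict))
for key, value in iris_data.items():
    print(key, type(value).__name__)
```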

Checking the keys

keys = iris_data.keys()
print('Iris dataset keys:', keys)

Iris dataset keys: dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

Printing the values of feature_names, target_names, data, and target

print('\n feature_names type:', type(iris_data.feature_names))
print(' feature_names length:', len(iris_data.feature_names))
print(iris_data.feature_names)

print('\n target_names type:', type(iris_data.target_names))
print(' target_names length:', len(iris_data.target_names))
print(iris_data.target_names)

print('\n data type:', type(iris_data.data))
print(' data shape:', iris_data.data.shape)
print(iris_data['data'])

print('\n target type:', type(iris_data.target))
print(' target shape:', iris_data.target.shape)
print(iris_data.target)

 feature_names type: <class 'list'>
 feature_names length: 4
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

 target_names type: <class 'numpy.ndarray'>
 target_names length: 3
['setosa' 'versicolor' 'virginica']

 data type: <class 'numpy.ndarray'>
 data shape: (150, 4)
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
........
........
 [6.3 2.5 5.  1.9]
 [6.5 3.  5.2 2. ]
 [6.2 3.4 5.4 2.3]
 [5.9 3.  5.1 1.8]]

 target type: <class 'numpy.ndarray'>
 target shape: (150,)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
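
The target array above stores integer labels, while target_names stores the matching species names. As a rough sketch of how the two fit together, the snippet below counts the samples per label and maps each label to its name with NumPy indexing. The variable names class_counts and species are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()

# Number of samples per integer label.
class_counts = np.bincount(iris_data.target)

# Indexing target_names with the label array maps each label to its name.
species = iris_data.target_names[iris_data.target]

for name, count in zip(iris_data.target_names, class_counts):
    print(name, count)
print(species[0], species[50], species[100])
```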
