Gender Recognition by Voice Using Python




MINISTRY FOR THE DEVELOPMENT OF INFORMATION TECHNOLOGIES AND COMMUNICATIONS OF THE REPUBLIC OF UZBEKISTAN

Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
Faculty of Software Engineering, student of group 315-20
Asadov Mironshoh
Pattern Recognition course
Practical assignment 2

Tashkent, 2023



Gender Recognition by Voice Using Python



import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)


# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory


import os
print(os.listdir("../input"))


/kaggle/input/voice.csv


import warnings
warnings.filterwarnings('ignore')
# read file
voice=pd.read_csv('../input/voice.csv')
voice.head()


voice.info()



RangeIndex: 3168 entries, 0 to 3167
Data columns (total 21 columns):
meanfreq 3168 non-null float64
sd 3168 non-null float64
median 3168 non-null float64
Q25 3168 non-null float64
Q75 3168 non-null float64
IQR 3168 non-null float64
skew 3168 non-null float64
kurt 3168 non-null float64
sp.ent 3168 non-null float64
sfm 3168 non-null float64
mode 3168 non-null float64
centroid 3168 non-null float64
meanfun 3168 non-null float64
minfun 3168 non-null float64
maxfun 3168 non-null float64
meandom 3168 non-null float64
mindom 3168 non-null float64
maxdom 3168 non-null float64
dfrange 3168 non-null float64
modindx 3168 non-null float64
label 3168 non-null object
dtypes: float64(20), object(1)
memory usage: 519.8+ KB


voice.describe()

Preprocessing: label encoding and normalization




from sklearn import preprocessing
le = preprocessing.LabelEncoder()
voice["label"] = le.fit_transform(voice["label"])
le.classes_
array(['female', 'male'], dtype=object)
voice[:]=preprocessing.MinMaxScaler().fit_transform(voice)
voice.head()
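To make the two preprocessing steps concrete, here is a minimal sketch on a small toy table. The values and the reduced column set are hypothetical stand-ins for voice.csv; only the LabelEncoder and MinMaxScaler calls mirror the code above.

```python
import pandas as pd
from sklearn import preprocessing

# Toy stand-in for voice.csv (hypothetical values, two features only)
toy = pd.DataFrame({
    "meanfun": [0.08, 0.11, 0.16, 0.20],
    "IQR": [0.12, 0.10, 0.04, 0.03],
    "label": ["male", "male", "female", "female"],
})

# LabelEncoder sorts the class names, so 'female' -> 0 and 'male' -> 1
le = preprocessing.LabelEncoder()
toy["label"] = le.fit_transform(toy["label"])

# MinMaxScaler rescales each column to [0, 1]: (x - min) / (max - min)
scaled = pd.DataFrame(
    preprocessing.MinMaxScaler().fit_transform(toy),
    columns=toy.columns,
)

print(list(le.classes_))
print(scaled)
```

Because the encoded label already takes only the values 0 and 1, scaling the whole table leaves that column unchanged, which is why the report can scale every column at once.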

Visualization


import seaborn as sns
import matplotlib.pyplot as plt
plt.subplots(4,5,figsize=(15,15))
for i in range(1,21):
    plt.subplot(4,5,i)
    plt.title(voice.columns[i-1])
    sns.kdeplot(voice.loc[voice['label'] == 0, voice.columns[i-1]], color= 'green', label='F')
    sns.kdeplot(voice.loc[voice['label'] == 1, voice.columns[i-1]], color= 'blue', label='M')

At first glance, the features that separate the two classes most clearly are Q25, IQR, and meanfun. We will build models using all 20 features and, separately, using only these 3 features.



Building models with K-Nearest Neighbors, Naive Bayes, Decision Tree, Random Forest, XGBoost, Support Vector Machine, and a Neural Network

from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

from sklearn import neighbors
from sklearn import naive_bayes
from sklearn import tree
from sklearn import ensemble
from sklearn import svm
from sklearn import neural_network
import xgboost
# Split the data
train, test = train_test_split(voice, test_size=0.3)


train.head()
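The later cells use x_train, y_train, x_test, y_test and their three-feature counterparts (x_train3 and so on), but the cell that defines them is not shown in this report. The sketch below is one plausible way to derive them. It assumes the label column is named "label" and that the three selected features are Q25, IQR and meanfun. The frame voice_demo is synthetic data used only so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the scaled `voice` frame (hypothetical data)
rng = np.random.default_rng(0)
cols = ["Q25", "IQR", "meanfun", "sd"]
voice_demo = pd.DataFrame(rng.random((100, 4)), columns=cols)
voice_demo["label"] = rng.integers(0, 2, size=100)

train, test = train_test_split(voice_demo, test_size=0.3, random_state=0)

# All feature columns: everything except the label
x_train, y_train = train.drop(columns="label"), train["label"]
x_test, y_test = test.drop(columns="label"), test["label"]

# Only the three features singled out in the visualization (assumed names)
top3 = ["Q25", "IQR", "meanfun"]
x_train3, y_train3 = train[top3], train["label"]
x_test3, y_test3 = test[top3], test["label"]

print(x_train.shape, x_test.shape, x_train3.shape)
```

Passing random_state, which the report's own split omits, makes the split reproducible, so error rates can be compared between runs.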

K-Nearest Neighbors
The model is built with neighbors.KNeighborsClassifier().
def knn_error(k,x_train,y_train,x_test,y_test):
    # Try every neighbor count from 1 to k-1 and record the test error rate
    error_rate = []
    K=range(1,k)
    for i in K:
        knn = neighbors.KNeighborsClassifier(n_neighbors = i)
        knn.fit(x_train, y_train)
        y_pred = knn.predict(x_test)
        error_rate.append(np.mean(y_pred != y_test))
    # Position of the smallest error; K[kloc] is the matching neighbor count
    kloc = error_rate.index(min(error_rate))
    print("Lowest error is %s occurs at k=%s." % (error_rate[kloc], K[kloc]))

    # Plot the error rate against k
    plt.plot(K, error_rate, color='blue', linestyle='dashed', marker='o',
             markerfacecolor='red', markersize=10)
    plt.title('Error Rate vs. K Value')
    plt.xlabel('K')
    plt.ylabel('Error Rate')
    plt.show()
    return K[kloc]


k=knn_error(21,x_train,y_train,x_test,y_test)
Lowest error is 0.02103049421661409 occurs at k=4.



model = neighbors.KNeighborsClassifier(n_neighbors = k)
classify(model,x_train,y_train,x_test,y_test)
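The classify helper called above is not defined anywhere in this report. The following is a minimal sketch of what such a helper might look like. It assumes the helper fits the model, predicts on the test set and prints a per-class report; the body and the returned accuracy are illustrative rather than the author's code. Synthetic data from make_classification stands in for the voice features.

```python
from sklearn import neighbors
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split


def classify(model, x_train, y_train, x_test, y_test):
    """Fit `model`, print a per-class report, and return the test accuracy."""
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    # Class 0 is 'female' and class 1 is 'male', matching the label encoder
    print(classification_report(y_test, y_pred,
                                target_names=["female", "male"], digits=4))
    return accuracy_score(y_test, y_pred)


# Synthetic two-class data standing in for the voice features (hypothetical)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

acc = classify(neighbors.KNeighborsClassifier(n_neighbors=4),
               x_tr, y_tr, x_te, y_te)
```

Returning the accuracy in addition to printing the report makes it easy to collect the scores of several models in one table later.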





k=knn_error(21,x_train3,y_train3,x_test3,y_test3)
Lowest error is 0.019978969505783387 occurs at k=10.
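The code for the other classifiers named in the model-building heading is not included in this report. As a hedged sketch, the scikit-learn ones could be compared with one loop over a dictionary of estimators. The hyperparameters below are defaults chosen for illustration, the data is synthetic, and XGBoost is left out so the example depends only on scikit-learn.

```python
from sklearn import ensemble, naive_bayes, neural_network, svm, tree
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for the voice features (hypothetical)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
x_tr, x_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# One estimator per model family listed in the report (illustrative settings)
models = {
    "Naive Bayes": naive_bayes.GaussianNB(),
    "Decision Tree": tree.DecisionTreeClassifier(random_state=0),
    "Random Forest": ensemble.RandomForestClassifier(random_state=0),
    "SVM": svm.SVC(),
    "Neural Network": neural_network.MLPClassifier(max_iter=1000,
                                                   random_state=0),
}

# Fit each model on the training split and score it on the test split
scores = {}
for name, model in models.items():
    model.fit(x_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(x_te))

# Print the models from most to least accurate
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.4f}")
```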


Observation 1

  1. The data is not uniformly distributed.

  2. Revenue is lower on weekends.

  3. Revenue is higher in November.







