scikit-learn Cookbook (Second Edition)

How to do it...

  1. Continuing with the Boston dataset, run the following commands:
X[:, :3].mean(axis=0)  # mean of the first 3 features

array([ 3.59376071, 11.36363636, 11.13677866])

X[:, :3].std(axis=0)

array([ 8.58828355, 23.29939569, 6.85357058])
  2. There is a lot to learn from these numbers. The first feature has the smallest mean, yet it varies more than the third feature. The second feature has the largest mean and the largest standard deviation, so it takes the widest spread of values. Standardize the three features with preprocessing.scale so that they share a common scale:
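from sklearn import preprocessing  # needed only if not already imported earlier
X_2 = preprocessing.scale(X[:, :3])  # center each column and scale it to unit variance
X_2.mean(axis=0)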

array([ 6.34099712e-17, -6.34319123e-16, -2.68291099e-15])

X_2.std(axis=0)

array([ 1., 1., 1.])
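
The scaled means above are not exactly zero; values such as 6.3e-17 are floating-point rounding error. To see what preprocessing.scale computes, the following minimal sketch standardizes a small array by hand and compares the result. It uses synthetic data as a stand-in, because load_boston was removed from scikit-learn in version 1.2; the names X_demo, X_manual, and X_scaled and the distribution parameters are assumptions made for illustration, not values from the recipe.

```python
import numpy as np
from sklearn import preprocessing

# Synthetic stand-in for three features (NOT the Boston data).
# The loc/scale values are illustrative assumptions only.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[3.6, 11.4, 11.1],
                    scale=[8.6, 23.3, 6.9],
                    size=(506, 3))

# Manual standardization: subtract each column's mean, then divide by
# each column's population standard deviation (NumPy's default, ddof=0).
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

# The same operation through scikit-learn.
X_scaled = preprocessing.scale(X_demo)

# Compare with a tolerance rather than exact equality, since the
# results differ only by floating-point rounding.
print(np.allclose(X_manual, X_scaled))
print(np.allclose(X_scaled.mean(axis=0), 0.0))
print(np.allclose(X_scaled.std(axis=0), 1.0))
```

Comparing with np.allclose rather than == is the safer check here, because the column means come out as tiny nonzero numbers instead of exact zeros.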