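This article uses the Boston housing price dataset that ships with TensorFlow. The dataset is small: it has only 506 samples, of which 404 are used for training and 102 are held out as the test set.
This article uses TensorFlow 2.0 Beta with Python 3.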
import tensorflow as tf
import matplotlib.pyplot as plt
(train_data,train_label),(test_data,test_label) = tf.keras.datasets.boston_housing.load_data()
print(train_data.shape)
print(test_data.shape)
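The printed shapes show 404 training samples and 102 test samples, each with 13 features. The 13 features are the standard Boston housing attributes, in column order: CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, and LSTAT.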
Processing the data
# Wrap the raw arrays in tf.data.Dataset objects
ds_train = tf.data.Dataset.from_tensor_slices((train_data, train_label))
ds_test = tf.data.Dataset.from_tensor_slices((test_data, test_label))
# Shuffle, repeat, and batch the training data; the test data is only batched
ds_train = ds_train.shuffle(train_label.shape[0]).repeat().batch(101)
ds_test = ds_test.batch(101)
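To make the effect of the batch size of 101 concrete, the following is a minimal pure-Python sketch of how a single pass over the data is split into batches. The helper `batch_sizes` is hypothetical and written only for illustration; it assumes the default batching behaviour in which a final partial batch is kept rather than dropped.

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of each batch produced by one pass over the data."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        # The last, smaller batch is kept (assumed default behaviour)
        sizes.append(remainder)
    return sizes


train_batches = batch_sizes(404, 101)
test_batches = batch_sizes(102, 101)
print(train_batches)  # [101, 101, 101, 101]
print(test_batches)   # [101, 1]
```

The 404 training samples divide evenly into 4 batches, while the 102 test samples produce one full batch plus a single leftover sample. A validation step count computed with floor division (102 // 101 = 1) therefore covers only the first batch; rounding up would cover both.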
def build_model():
    # Create the model
    model = tf.keras.Sequential([
        # First hidden layer; it receives the 13 input features
        tf.keras.layers.Dense(64, activation=tf.nn.relu, input_shape=(13,)),
        # Second hidden layer
        tf.keras.layers.Dense(64, activation=tf.nn.relu),
        # Output layer: a single value, the predicted price
        tf.keras.layers.Dense(1)
    ])
    # Compile the model
    model.compile(
        optimizer=tf.optimizers.Adam(0.001),
        loss=tf.losses.mse,
        metrics=['mae']
    )
    return model
model = build_model()
# summary() prints the table itself and returns None, so no print() is needed
model.summary()
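As a cross-check on the model summary, the parameter count of each fully connected layer can be worked out by hand: a dense layer with `inputs` inputs and `units` outputs has `inputs * units` weights plus `units` biases. The sketch below applies that formula to the three layers defined above; the helper name `dense_params` is hypothetical.

```python
def dense_params(inputs, units):
    """Weights plus biases of one fully connected layer."""
    return inputs * units + units


# (inputs, units) for the three Dense layers: 13 -> 64 -> 64 -> 1
layer_shapes = [(13, 64), (64, 64), (64, 1)]
per_layer = [dense_params(i, u) for i, u in layer_shapes]
total = sum(per_layer)
print(per_layer)  # [896, 4160, 65]
print(total)      # 5121
```

With only 404 training samples, a model of roughly five thousand parameters can overfit easily, which is one reason to watch the validation loss during training.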
# Number of steps (batches) per epoch: 404 // 101 = 4 and 102 // 101 = 1
train_steps_per_epochs = train_data.shape[0] // 101
test_steps_per_epochs = test_data.shape[0] // 101
# Stop training automatically once val_loss has not improved for 20 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)
# Train for up to 500 epochs and keep the training record in history
history = model.fit(
    ds_train,
    epochs=500,
    steps_per_epoch=train_steps_per_epochs,
    validation_data=ds_test,
    validation_steps=test_steps_per_epochs,
    callbacks=[early_stop]
)
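The idea behind early stopping with a patience value can be illustrated without TensorFlow. The following is a simplified pure-Python sketch, not the Keras implementation: it tracks the best validation loss seen so far and stops once the loss has failed to improve for `patience` consecutive epochs. The function name `stopping_epoch` and the loss values are made up for illustration.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            # No improvement: count this epoch against the patience budget
            wait += 1
            if wait >= patience:
                return epoch
    return None  # never ran out of patience


# Hypothetical validation losses; the best value (3.0) occurs at epoch 2
losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.05]
print(stopping_epoch(losses, patience=3))  # 5
```

With `patience=20` as in the training code above, the run may end well before the 500-epoch limit if the validation loss plateaus.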
The metrics recorded during training are all stored in history, so we can plot them to inspect how training progressed:
# Plotting function
def plot_history(history):
    # Top panel: training and validation loss (MSE)
    plt.subplot(2, 1, 1)
    plt.title('loss')
    plt.plot(history.epoch, history.history.get('loss'), label='loss')
    plt.plot(history.epoch, history.history.get('val_loss'), label='val_loss')
    plt.legend()
    # Bottom panel: training and validation mean absolute error
    plt.subplot(2, 1, 2)
    plt.title('mae')
    plt.plot(history.epoch, history.history.get('mae'), label='mae')
    plt.plot(history.epoch, history.history.get('val_mae'), label='val_mae')
    plt.legend()
    plt.show()
plot_history(history)
test_predictions = model.predict(test_data).flatten()
print(test_predictions)
[11.642183 19.59567 21.726236 34.196423 22.4167 22.425404
28.799603 23.66306 19.806677 19.32042 15.688759 17.867088
16.530552 40.41346 19.615501 21.516493 24.449144 19.616917
17.958574 26.871962 11.995824 8.774342 20.539331 15.862772
22.449028 22.83855 30.204325 38.96432 13.9990225 22.611979
21.11433 15.393939 34.931477 23.673939 17.807938 9.601786
15.747133 13.715659 20.602491 29.066439 25.852331 23.244295
16.11424 33.486485 41.81491 24.000952 31.267826 19.320993
26.541712 22.562246 37.748898 18.671835 12.418766 17.432077
31.431618 25.197182 13.587622 33.84362 37.0064 22.762499
21.182072 18.298859 16.805908 21.369957 25.80291 24.897501
17.08489 26.052343 11.837404 10.8254385 25.344805 27.223076
22.880053 13.774416 24.849585 19.550842 21.7954 22.697067
38.01279 11.396687 21.819597 37.327755 16.32067 16.062927
20.142687 17.874754 19.599241 21.31827 19.469305 30.204813
19.356852 24.60395 21.798048 29.356714 39.006306 19.91289
37.461536 37.036076 25.903673 45.908554 31.342543 20.151966 ]
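To judge predictions like these, they are compared against the true labels using the same two quantities the model was compiled with: mean squared error (the loss) and mean absolute error (the metric). The sketch below computes both in plain Python on a small made-up example; the function names and the sample values are illustrative and are not taken from the run above.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels and predictions, for illustration only
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 19.0, 33.0]
print(mse(y_true, y_pred))  # (4 + 1 + 9) / 3
print(mae(y_true, y_pred))  # (2 + 1 + 3) / 3 = 2.0
```

MAE is in the same units as the target, which makes it easier to read than MSE: an MAE of 2.0 means the predictions are off by 2 target units on average.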
The complete code is as follows:
import tensorflow as tf
import matplotlib.pyplot as plt
# Load the dataset (Boston housing prices)
(train_data, train_label), (test_data, test_label) = tf.keras.datasets.boston_housing.load_data()
# The training set has 404 samples with 13 features each
print(train_data.shape)
# The test set has 102 samples with 13 features each
print(test_data.shape)
# Wrap the raw arrays in tf.data.Dataset objects
ds_train = tf.data.Dataset.from_tensor_slices((train_data, train_label))
ds_test = tf.data.Dataset.from_tensor_slices((test_data, test_label))
# Shuffle, repeat, and batch the training data; the test data is only batched
ds_train = ds_train.shuffle(train_label.shape[0]).repeat().batch(101)
ds_test = ds_test.batch(101)
# Build the model
def build_model():
    # Create the model
    model = tf.keras.Sequential([
        # First hidden layer; it receives the 13 input features
        tf.keras.layers.Dense(64, activation=tf.nn.relu, input_shape=(13,)),
        # Second hidden layer
        tf.keras.layers.Dense(64, activation=tf.nn.relu),
        # Output layer: a single value, the predicted price
        tf.keras.layers.Dense(1)
    ])
    # Compile the model
    model.compile(
        optimizer=tf.optimizers.Adam(0.001),
        loss=tf.losses.mse,
        metrics=['mae']
    )
    return model
# Plotting function
def plot_history(history):
    # Top panel: training and validation loss (MSE)
    plt.subplot(2, 1, 1)
    plt.title('loss')
    plt.plot(history.epoch, history.history.get('loss'), label='loss')
    plt.plot(history.epoch, history.history.get('val_loss'), label='val_loss')
    plt.legend()
    # Bottom panel: training and validation mean absolute error
    plt.subplot(2, 1, 2)
    plt.title('mae')
    plt.plot(history.epoch, history.history.get('mae'), label='mae')
    plt.plot(history.epoch, history.history.get('val_mae'), label='val_mae')
    plt.legend()
    plt.show()
model = build_model()
# summary() prints the table itself and returns None, so no print() is needed
model.summary()
# Number of steps (batches) per epoch: 404 // 101 = 4 and 102 // 101 = 1
train_steps_per_epochs = train_data.shape[0] // 101
test_steps_per_epochs = test_data.shape[0] // 101
# Stop training automatically once val_loss has not improved for 20 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)
# Train for up to 500 epochs and keep the training record in history
history = model.fit(
    ds_train,
    epochs=500,
    steps_per_epoch=train_steps_per_epochs,
    validation_data=ds_test,
    validation_steps=test_steps_per_epochs,
    callbacks=[early_stop]
)
# Plot the training curves
plot_history(history)
# Predict on the test set
test_predictions = model.predict(test_data).flatten()
print(test_predictions)