Building a neural network: how not to break your brain

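Hi, Habr!

In this short article I will tell you about two pitfalls that are easy to run into and just as easy to fix.

We will build a trivial neural network in Keras and use it to predict the arithmetic mean of two numbers.

It would seem that nothing could be simpler. And indeed there is nothing complicated here, but there are nuances.

If the topic interests you, read on. There will be no long, boring descriptions, just short code and comments on it.

The solution looks roughly like this: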

    import numpy as np
    from keras.layers import Input, Dense, Lambda
    from keras.models import Model
    import keras.backend as K

    # Data generator
    def train_iterator(batch_size=64):
        x = np.zeros((batch_size, 2))
        while True:
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean = (x[::,0] + x[::,1]) / 2
            x_mean_ex = np.expand_dims(x_mean, -1)
            yield [x], [x_mean_ex]

    # Model
    def create_model():
        x = Input(name = 'x', shape=(2,))
        x_mean = Dense(1)(x)
        model = Model(inputs=x, outputs=x_mean)
        return model

    # Create and train the model
    model = create_model()
    model.compile(loss=['mse'], optimizer = 'rmsprop')
    model.fit_generator(train_iterator(), steps_per_epoch = 1000, epochs = 100, verbose = 1)

    # Check a prediction
    x, x_mean = next(train_iterator(1))
    print(x, x_mean, model.predict(x))

We try to train it... and nothing works. This is the point where you can start trying random fixes and waste a lot of time.

    Epoch 1/100
    1000/1000 [==============================] - 2s 2ms/step - loss: 1044.0806
    Epoch 2/100
    1000/1000 [==============================] - 2s 2ms/step - loss: 713.5198
    Epoch 3/100
    1000/1000 [==============================] - 3s 3ms/step - loss: 708.1110
    ...
    Epoch 98/100
    1000/1000 [==============================] - 2s 2ms/step - loss: 415.0479
    Epoch 99/100
    1000/1000 [==============================] - 2s 2ms/step - loss: 416.6932
    Epoch 100/100
    1000/1000 [==============================] - 2s 2ms/step - loss: 417.2400
    [array([[73., 57.]])] [array([[65.]])] [[49.650894]]

The prediction is 49, although it should be 65.

But as soon as we rework the generator a little, everything starts working right away.

    def train_iterator_1(batch_size=64):
        x = np.zeros((batch_size, 2))
        x_mean = np.zeros((batch_size,))
        while True:
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean[::] = (x[::,0] + x[::,1]) / 2
            x_mean_ex = np.expand_dims(x_mean, -1)
            yield [x], [x_mean_ex]

And the network converges literally before our eyes.

    Epoch 1/5
    1000/1000 [==============================] - 2s 2ms/step - loss: 648.9184
    Epoch 2/5
    1000/1000 [==============================] - 2s 2ms/step - loss: 0.0177
    Epoch 3/5
    1000/1000 [==============================] - 2s 2ms/step - loss: 0.0030

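The main difference is that in the first case the x_mean object is created in memory anew on every iteration, while in the second case it is created once, when the generator is created, and is only reused after that.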
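The difference between the two assignments can be checked directly in NumPy. The sketch below is only an illustration with made-up variable names (`buffer_before`, `same_after_slice_assign` and so on are not from the original code): slice assignment writes into the existing buffer, `np.expand_dims` returns a view of that buffer, and plain assignment binds the name to a new array.

```python
import numpy as np

x = np.zeros((4, 2))
x_mean = np.zeros((4,))
buffer_before = x_mean  # keep a second reference to the original buffer

# Slice assignment writes into the existing array; no new object is created.
x_mean[::] = (x[::, 0] + x[::, 1]) / 2
same_after_slice_assign = x_mean is buffer_before

# For this array expand_dims returns a view that shares memory with x_mean.
x_mean_ex = np.expand_dims(x_mean, -1)
view_shares_memory = np.shares_memory(x_mean_ex, x_mean)

# Plain assignment binds the name x_mean to a freshly allocated array.
x_mean = (x[::, 0] + x[::, 1]) / 2
same_after_rebind = x_mean is buffer_before

print(same_after_slice_assign, view_shares_memory, same_after_rebind)
# prints: True True False
```

So in the second generator both the inputs and the targets live in buffers that are reused between iterations, whereas in the first one only the inputs do.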

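Let's look further into whether everything is right with this generator. It turns out that it is not.
The following example shows what goes wrong.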

    def train_iterator(batch_size=1):
        x = np.zeros((batch_size, 2))
        while True:
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean = (x[::,0] + x[::,1]) / 2
            yield x, x_mean

    it = train_iterator()
    print(next(it), next(it))

(array([[44., 2.]]), array([10.])) (array([[44., 2.]]), array([23.]))

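The mean returned by the first iterator call does not match the numbers it was supposedly computed from. In fact the mean was computed correctly, but because the array is passed by reference, the second call to the iterator overwrote the values in that array, and print() shows what is in the array now rather than what we expected.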
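This can be verified without reading the printed numbers by eye. The following is a minimal sketch with the same structure as the generator above; the name `shared_buffer_iterator` and the `snapshot` variable are introduced here only for illustration.

```python
import numpy as np

def shared_buffer_iterator(batch_size=1):
    # Hypothetical name; same logic as train_iterator above.
    x = np.zeros((batch_size, 2))  # allocated once, reused on every iteration
    while True:
        for i in range(batch_size):
            x[i][0] = np.random.randint(0, 100)
            x[i][1] = np.random.randint(0, 100)
        x_mean = (x[::, 0] + x[::, 1]) / 2  # new array on every iteration
        yield x, x_mean

it = shared_buffer_iterator()
x_first, mean_first = next(it)
snapshot = x_first.copy()          # what the first batch really contained
x_second, mean_second = next(it)   # this call overwrites the shared buffer

print(x_first is x_second)         # True: both names point at one array
print(mean_first is mean_second)   # False: the targets are separate arrays
# The first target still matches the data it was computed from.
print(np.allclose(mean_first, snapshot.mean(axis=1)))  # True
```

Any consumer that holds on to a yielded batch while the generator keeps running, for example a prefetch queue, would see inputs and targets drift apart in the same way.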

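There are two ways to solve this problem. Both are more expensive, but correct.
1. Move the creation of the variable x inside the while loop, so that a new array is created for every yield.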

    def train_iterator_1(batch_size=1):
        while True:
            x = np.zeros((batch_size, 2))
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean = (x[::,0] + x[::,1]) / 2
            yield x, x_mean

    it_1 = train_iterator_1()
    print(next(it_1), next(it_1))

(array([[82., 4.]]), array([43.])) (array([[77., 34.]]), array([55.5]))

2. Return a copy of the array.

    def train_iterator_2(batch_size=1):
        # x is allocated once and reused; each yield returns a copy of it
        x = np.zeros((batch_size, 2))
        while True:
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean = (x[::,0] + x[::,1]) / 2
            yield np.copy(x), x_mean

    it_2 = train_iterator_2()
    print(next(it_2), next(it_2))

(array([[63., 31.]]), array([47.])) (array([[94., 25.]]), array([59.5]))

Now everything is fine. Let's move on.

Is expand_dims actually needed? Let's try removing that line. The new code looks like this:

    def train_iterator(batch_size=64):
        while True:
            x = np.zeros((batch_size, 2))
            for i in range(batch_size):
                x[i][0] = np.random.randint(0, 100)
                x[i][1] = np.random.randint(0, 100)
            x_mean = (x[::,0] + x[::,1]) / 2
            yield [x], [x_mean]

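Everything trains just fine, even though the returned data now has a different shape.

For example, what used to be [[49.]] is now [49.], but apparently this is correctly brought to the required shape somewhere inside Keras.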
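To make the shape difference concrete, here is a small NumPy-only sketch; the two sample rows are arbitrary values chosen for illustration.

```python
import numpy as np

x = np.array([[73., 57.], [10., 20.]])
x_mean = (x[::, 0] + x[::, 1]) / 2       # one value per sample
x_mean_ex = np.expand_dims(x_mean, -1)   # one single-element row per sample

print(x_mean.shape, x_mean)        # (2,) [65. 15.]
print(x_mean_ex.shape)             # (2, 1)
print(x_mean_ex.tolist())          # [[65.0], [15.0]]
```

The model's Dense(1) output has the second form, shape (batch_size, 1), so the targets without expand_dims are one dimension short.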

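So now we know what a correct data generator should look like. Let's play with a lambda function and see how expand_dims behaves there.

We will not predict anything; we will simply compute the correct value inside the lambda.

The code is as follows: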

    def calc_mean(x):
        res = (x[::,0] + x[::,1]) / 2
        res = K.expand_dims(res, -1)
        return res

    def create_model():
        x = Input(name = 'x', shape=(2,))
        x_mean = Lambda(lambda x: calc_mean(x), output_shape=(1,))(x)
        model = Model(inputs=x, outputs=x_mean)
        return model

We run it and see that everything is fine:

    Epoch 1/5
    100/100 [==============================] - 0s 3ms/step - loss: 0.0000e+00
    Epoch 2/5
    100/100 [==============================] - 0s 2ms/step - loss: 0.0000e+00
    Epoch 3/5
    100/100 [==============================] - 0s 3ms/step - loss: 0.0000e+00

Now let's modify the lambda function slightly and remove expand_dims.

    def calc_mean(x):
        res = (x[::,0] + x[::,1]) / 2
        return res

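No shape error is raised when the model is compiled, but the result is already different, and it is not obvious how the loss is being computed. So here expand_dims is required; nothing happens automatically.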

    Epoch 1/5
    100/100 [==============================] - 0s 3ms/step - loss: 871.6299
    Epoch 2/5
    100/100 [==============================] - 0s 3ms/step - loss: 830.2568
    Epoch 3/5
    100/100 [==============================] - 0s 2ms/step - loss: 830.8041

And if you look at the result returned by predict(), you can see that the shape is wrong: the output is [46.], while [[46.]] was expected.
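One plausible explanation for the strange loss is broadcasting: when a target of shape (N, 1) is compared with an output of shape (N,), the subtraction produces an (N, N) matrix instead of N differences. The sketch below reproduces this effect in plain NumPy. It is an assumption about what happens inside the loss, not a trace of the actual Keras code, and the exact behavior may depend on the Keras version and backend.

```python
import numpy as np

y_true = np.array([[10.], [20.], [30.]])       # targets, shape (3, 1)
y_pred_flat = np.array([10., 20., 30.])        # output without expand_dims, shape (3,)
y_pred_col = np.expand_dims(y_pred_flat, -1)   # output with expand_dims, shape (3, 1)

# (3, 1) - (3,) broadcasts to (3, 3): every target minus every prediction.
diff_flat = y_true - y_pred_flat
# (3, 1) - (3, 1) stays (3, 1): each target minus its own prediction.
diff_col = y_true - y_pred_col

mse_flat = np.mean(diff_flat ** 2)
mse_col = np.mean(diff_col ** 2)

print(diff_flat.shape, diff_col.shape)  # (3, 3) (3, 1)
print(mse_flat, mse_col)                # nonzero vs 0.0
```

Even though every prediction here is exactly right, the broadcast version reports a nonzero error because it also compares each target with the predictions for all the other samples. For means of two uniform integers in [0, 100), such an all-pairs comparison would give a value in the low 800s, which is at least consistent with the loss of about 830 seen above.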

That's about it. Thanks to everyone who read this far. And be careful with the details: their effect can be significant.
