Even though I have never worked as a data engineer, I have been playing with machine learning for a while now. At first, I was more interested in the mathematics behind it; I even made a repository where I tried implementing it.

Having been in the world of Bitcoin and, more generally, cryptocurrencies for a while, I wondered whether a sufficiently complex algorithm could be used to predict the price.

That turned out to be harder than I thought.

After all the models I have built, I decided to try to prove a point: without very complex models, insider information, or a lot of money, you can't predict it. I came to the conclusion that, taken by themselves, stock prices and the Bitcoin price are just noise. I believe more precise predictions are possible, but not from the price or the RSI alone.

How will I show you the difference between a predictable time series and an unpredictable one?

Using a predictable trigonometric function

The predictable time series I am going to use is a sum of sines and cosines:

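import math

import matplotlib.pyplot as plt
import numpy as np

n = 10000

# Sum of three sinusoids with different periods, sampled at i = 1 .. n - 1
array = np.array([math.sin(i * 0.02) + math.cos(i * 0.05) - math.sin(i * 0.01) for i in range(1, n)])
fig, ax = plt.subplots()
ax.plot(range(1, n), array, linewidth=0.75)
plt.show()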

That gives us this curve:

Most tutorials you will find on the internet predict the next value from the last 150. We are going to do something a bit different. Let's associate each value with a sell index: 1 when the current price sits in the top 20% of the range it covers over the next 150 points, which makes it a good moment to sell, and 0 otherwise.

We can do this with a simple algorithm:

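# 1 where selling looks attractive, 0 elsewhere
SELL_INDEX = np.zeros((len(array), 1))

for index, row in enumerate(array):

    # The last points do not have a full 150-point window ahead of them
    if index > len(array) - 150:
        continue

    # Highest and lowest price over the current point and the 149 that follow
    max_price = np.amax(array[index:index + 150])
    min_price = np.amin(array[index:index + 150])

    # Position of the current price inside that range: 0 = lowest, 1 = highest
    current_sell_index = (row - min_price) / (max_price - min_price)

    SELL_INDEX[index][0] = 1 if current_sell_index > 0.8 else 0

# Columns: price, sell index, position in the series
data_with_sell_index = np.hstack((array.reshape(-1, 1), SELL_INDEX))
data_final = np.hstack((data_with_sell_index, np.arange(len(data_with_sell_index)).reshape(-1, 1)))
# Drop the tail, which has no full window ahead of it
data_final = data_final[:len(data_final) - 150]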
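The same labelling rule can also be written without an explicit Python loop. This is only a sketch, assuming NumPy 1.20 or later (for `sliding_window_view`); the function name `label_sell_points` and its parameters are illustrative and not part of the original code. It also guards against a flat window, where the range would be zero.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def label_sell_points(prices, horizon=150, threshold=0.8):
    """Return 1 where the price is in the top part of its upcoming range, else 0.

    One label is produced per full window, so the result has
    len(prices) - horizon + 1 entries.
    """
    prices = np.asarray(prices, dtype=float)
    # windows[i] is prices[i : i + horizon]: the current point and the ones after it
    windows = sliding_window_view(prices, horizon)
    lows = windows.min(axis=1)
    highs = windows.max(axis=1)
    spans = highs - lows
    current = prices[: len(windows)]
    # Relative position of the current price in the window (0 = lowest, 1 = highest).
    # A flat window (span of 0) is given position 0 instead of dividing by zero.
    position = np.divide(current - lows, spans, out=np.zeros_like(spans), where=spans > 0)
    return (position > threshold).astype(int)


if __name__ == "__main__":
    demo = np.sin(np.arange(1, 1000) * 0.02)
    print(label_sell_points(demo)[:10])
```

Unlike the loop, this version returns only the labels; the truncation of the tail is implicit in the number of windows.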

Let's apply it to our sum of sines and cosines; here is the result:

OK, we are all set. The idea is for the model to predict the nᵗʰ sell index from the 150 previous prices, that is, from n - 150 to n - 1.
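The step that turns the series into training samples is not shown in this post, so here is one way it could be done. It is a sketch under assumptions: `make_windows`, `lookback`, `X` and `y` are names chosen for illustration, and the shapes are the ones an `Input(shape=(150, 1))` layer would accept.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(prices, labels, lookback=150):
    """Pair each run of `lookback` prices with the label of the point that follows it."""
    prices = np.asarray(prices, dtype=np.float32)
    labels = np.asarray(labels, dtype=np.float32)
    if len(prices) != len(labels):
        raise ValueError("prices and labels must have the same length")
    # windows[k] is prices[k : k + lookback]; the last window has no following
    # point to predict, so it is dropped.
    windows = sliding_window_view(prices, lookback)[:-1]
    # Copy out of the read-only view and add the trailing feature axis
    X = np.ascontiguousarray(windows)[..., np.newaxis]  # (samples, lookback, 1)
    y = labels[lookback:]                                # (samples,)
    return X, y


if __name__ == "__main__":
    # Hypothetical stand-in for data_final[:, 0] and data_final[:, 1]
    demo_prices = np.sin(np.arange(500) * 0.02)
    demo_labels = (demo_prices > 0.5).astype(float)
    X, y = make_windows(demo_prices, demo_labels)
    print(X.shape, y.shape)
```

With the arrays built earlier, the call would look like `make_windows(data_final[:, 0], data_final[:, 1])`.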

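The model looks like this:

# Assumes the standalone keras package; with TensorFlow, import from tensorflow.keras instead
import keras
from keras.layers import LSTM, Dense, Dropout, Input
from keras.models import Model

# 150 time steps, one feature (the price) per step
input_layer = Input(shape=(150, 1))
layer_1_lstm = LSTM(50, return_sequences=True)(input_layer)
dropout_1 = Dropout(0.1)(layer_1_lstm)
layer_2_lstm = LSTM(50, return_sequences=True)(dropout_1)
dropout_2 = Dropout(0.1)(layer_2_lstm)
layer_3_lstm = LSTM(50)(dropout_2)

# A single sigmoid unit: the probability that the sell index is 1
output_sell_index_proba = Dense(1, activation='sigmoid')(layer_3_lstm)

model = Model(inputs=input_layer, outputs=output_sell_index_proba)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[keras.metrics.BinaryAccuracy()])
model.summary()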

Since the target is either 1 or 0, I use the BinaryAccuracy metric and the binary_crossentropy loss function. For the optimizer, Adam is often a good default; it is the one that has given me the best results.

After training for ten epochs, I end up with a validation loss of about 0.03 and a validation accuracy of almost 0.99.
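To make these two quantities concrete, here is a small NumPy sketch of what they measure. It is not Keras's implementation: the helper names and the clipping constant `eps` are assumptions made for illustration. The 0.5 threshold matches the default of `keras.metrics.BinaryAccuracy`.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    predicted = np.asarray(y_prob, dtype=float) > threshold
    actual = np.asarray(y_true, dtype=float) > 0.5
    return float(np.mean(predicted == actual))


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    probabilities = [0.9, 0.2, 0.4, 0.6]
    print(binary_crossentropy(labels, probabilities))
    print(binary_accuracy(labels, probabilities))
```

As a reference point, a model that always answers 0.5 scores ln 2 ≈ 0.693 on this loss, whatever the labels are.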

Epoch: 8. Reducing Learning Rate from 0.0009227448608726263 to 0.000913517433218658
Epoch 10/10
125/125 [==============================] - 28s 221ms/step - loss: 0.0389 - binary_accuracy: 0.9853 - val_loss: 0.0310 - val_binary_accuracy: 0.9864
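The "Reducing Learning Rate" line comes from a callback that is not shown here. The two logged values differ by a factor of almost exactly 0.99, so the log is consistent with a schedule that multiplies the learning rate by 0.99 after each epoch. That factor is inferred from the log, not taken from the training code, and `decay_step` is a hypothetical name.

```python
def decay_step(epoch, lr, factor=0.99):
    """Return the learning rate for the next epoch.

    The (epoch, lr) signature is the one accepted by
    keras.callbacks.LearningRateScheduler, so the function could be passed to
    it as LearningRateScheduler(decay_step). `epoch` is unused because the
    decay only depends on the current rate.
    """
    # factor=0.99 is an assumption inferred from the logged values
    return lr * factor


if __name__ == "__main__":
    logged_before = 0.0009227448608726263
    logged_after = 0.000913517433218658
    print(decay_step(8, logged_before), logged_after)
```

The small remaining gap between the two printed numbers is what one would expect if the rate is stored in single precision.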

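Let's now try to predict on new data that the model did not see during training.

# Prices from index 9000 onward
data = np.array(data_final[:, 0][9000:])
results = np.array([])
# Stop at len(data) so that every window contains exactly 150 prices
for i in range(150, len(data)):
    result = model.predict(data[i - 150 : i].reshape(1, 150, 1))
    # Append at the end so the predictions stay in chronological order
    results = np.append(results, result)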

Here is the chart:

The accuracy on unseen data is pretty good, even though there are some inconsistencies.

Using Bitcoin price

Let's plot the last 10,000 closing prices of BTC, taken from 60-second candles.

I will apply the same labelling algorithm I used for the trigonometric function; here is the result.

Now that I have everything to work with, I can reapply the process from the previous section. The only difference is that I will use a MinMaxScaler so that the input values only vary from 0 to 1; neural networks have a hard time working with inputs that vary a lot (here, between 20k and 40k).

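from sklearn.preprocessing import MinMaxScaler

# x is expected to be a 2D array of shape (n_samples, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
fitter = scaler.fit(x)  # fit() returns the scaler itself
x = fitter.transform(x)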
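For a (0, 1) range, the scaler applies (value - min) / (max - min) to each column. The sketch below reproduces that arithmetic in plain NumPy on made-up prices; `fit_min_max` and `apply_min_max` are illustrative helpers, not scikit-learn functions. Splitting fit and apply also shows what happens when the bounds are learned on one part of the data and reused on another.

```python
import numpy as np


def fit_min_max(train):
    """Learn the per-column minimum and maximum from the training data."""
    train = np.asarray(train, dtype=float)
    return train.min(axis=0), train.max(axis=0)


def apply_min_max(values, lo, hi):
    """Map values so that lo becomes 0 and hi becomes 1."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)


if __name__ == "__main__":
    # Hypothetical prices, one column, chosen only for the example
    train_prices = np.array([[20000.0], [30000.0], [40000.0]])
    later_prices = np.array([[42000.0]])

    lo, hi = fit_min_max(train_prices)
    print(apply_min_max(train_prices, lo, hi).ravel())
    print(apply_min_max(later_prices, lo, hi).ravel())
```

A later price above the training maximum is mapped above 1, which is worth keeping in mind when the scaler is fitted on past data only.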

The first difference I noticed is the loss: it is stuck at around 0.5.

Epoch: 8. Reducing Learning Rate from 0.000817907159216702 to 0.0008097281097434461
Epoch 10/10
125/125 [==============================] - 27s 218ms/step - loss: 0.5074 - binary_accuracy: 0.7952 - val_loss: 0.4372 - val_binary_accuracy: 0.8475

And here is a prediction chart:

There is nothing we can rely on.

The model does not find any pattern in the price; it treats it as noise.
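One way to check that an accuracy around 0.8 carries no information is to compare it with a predictor that ignores the input and always outputs the share of 1s in the labels. The sketch below computes that baseline. The 20% share used in the example is hypothetical, since the real class balance of the Bitcoin labels is not given in this post, and `majority_baseline` is an illustrative name.

```python
import numpy as np


def majority_baseline(labels):
    """Accuracy and binary cross-entropy of a constant predictor.

    The predictor always outputs p, the fraction of 1s in `labels`.
    """
    labels = np.asarray(labels, dtype=float)
    p = float(labels.mean())
    # Thresholded at 0.5, a constant p always predicts the majority class
    accuracy = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        loss = 0.0
    else:
        loss = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return accuracy, float(loss)


if __name__ == "__main__":
    # Hypothetical labels: 20% ones, 80% zeros
    example_labels = [1] * 20 + [0] * 80
    print(majority_baseline(example_labels))
```

With 20% of 1s, this input-blind baseline reaches an accuracy of 0.80 and a loss of about 0.50, the same range as the numbers logged above. Running it on the real labels would show how much of the score, if any, comes from the model.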

Conclusion

People have been trying to predict stock prices for a very long time, and they have invented many ways to do it:

Technical analysis, such as the RSI or Ichimoku charts
Artificial intelligence techniques
Sentiment analysis

I think that no one can accurately predict the stock market without a solid understanding of the asset or massive investments.

You can find the code on my GitHub repository: https://github.com/mathias-vandaele/keras-research.
