Open access dataset used in the article
A Chronological Evolution Model for Crypto-Ransomware Detection based on Encrypted File-Sharing Traffic


This dataset contains the samples and the models trained and tested in the article, together with the directories used to generate the samples. It is distributed as five separate compressed files:


The files containing samples are structured as follows:

The models can be loaded in a Python script using Keras. Some important considerations about them are explained below.
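As a minimal sketch of loading a model, the helper below wraps the Keras function load_model(). The file name and the saved-model format are assumptions, since they are not stated here. NN.h5 is a hypothetical name used only for illustration.

```python
from pathlib import Path


def load_trained_model(model_path):
    """Load a saved Keras model from disk.

    The path and the file format are assumptions; use the file names
    that actually appear in the extracted dataset.
    """
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    # Imported lazily so the path check above runs even without Keras installed.
    from keras.models import load_model
    return load_model(str(path))
```

For example, load_trained_model("NN.h5") would return the model object if a file with that hypothetical name exists in the working directory.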

Neural Network model (NN)

The Neural Network model is composed of three hidden layers with 512, 256 and 128 units. The input layer has 30 units, and the output layer has a single unit (binary classification).

Its complete structure is described in NN.json, in the main directory of the repository. The file was obtained by calling to_json() on the Keras model.
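The JSON description can be inspected without Keras. The sketch below uses only the standard library to list each layer's class and number of units. It assumes the usual to_json() layout, in which the layers sit either under config["layers"] or directly in config, depending on the Keras version. The function names are illustrative and not part of the dataset.

```python
import json


def summarize_architecture(json_text):
    """Return (layer class, units) pairs from a Keras to_json() string."""
    spec = json.loads(json_text)
    config = spec["config"]
    # Assumption: newer Keras versions nest the layers under config["layers"],
    # while older ones store the layer list directly in "config".
    layers = config["layers"] if isinstance(config, dict) else config
    # Layers without a "units" setting (e.g. pooling layers) report None.
    return [(layer["class_name"], layer["config"].get("units")) for layer in layers]


def summarize_file(path):
    """Read an architecture file such as NN.json and summarize its layers."""
    with open(path, encoding="utf-8") as handle:
        return summarize_architecture(handle.read())
```

Calling summarize_file("NN.json") from the main directory of the repository should list the layers of the NN model. The same helper should also work for the other architecture files if they follow the same layout.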

Convolutional Neural Network model (CNN)

The Convolutional Neural Network model is composed of two convolutional layers followed by two pooling layers and a final one-unit dense layer that performs the binary classification of the sample.

Its complete structure is described in CNN.json, in the main directory of the repository. The file was obtained by calling to_json() on the Keras model.

Long Short Term Memory models (LSTM)

All the Long Short Term Memory models compiled in this article have the same structure: an input layer and one additional hidden layer, followed by an output layer with a single unit.

As in the previous cases, the complete structure is described in LSTM.json, in the main directory of the repository. The file was obtained by calling to_json() on the Keras model.

General considerations

In a prediction, each model outputs a value between 0 and 1 rather than a binary label. Because our classification problem is binary (two classes), a threshold must be set on the classifier output. After some experiments, we concluded that the best option is to set the threshold to 0.99, because false positives are much more problematic than false negatives. All the experiments in the article were performed with this threshold.
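The thresholding step can be sketched in a few lines of plain Python. Two details are assumptions not stated above: that label 1 denotes the positive (ransomware) class, and that a score exactly equal to the threshold counts as positive.

```python
THRESHOLD = 0.99  # value used for all experiments in the article


def classify(scores, threshold=THRESHOLD):
    """Map model outputs in [0, 1] to binary labels.

    Assumptions: 1 is the positive class, and a score equal to the
    threshold is treated as positive.
    """
    return [1 if score >= threshold else 0 for score in scores]


# Only scores at or above 0.99 are flagged as positive.
labels = classify([0.50, 0.989, 0.99, 0.999])
```

With such a high threshold, a sample is labelled positive only when the model is almost certain. This reduces false positives at the cost of more false negatives.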