We have reviewed several classes of data, but one of the most important—and challenging—types remains: network traffic data. Its significance stems from the fact that many attacks originate from network-based vectors.
Data collection, preparation, and cleaning can take up 80% to 90% of the time in an ML project, so it is not surprising that we need to cover the topic across multiple blog posts. In the first part, we talked mainly about theory: how we collect data from our systems and how we store it.
You have probably heard this saying many times before: “Garbage in, garbage out.” This is a big deal in machine learning projects, and even more so in ML cybersecurity projects, because a model can only be as good as the data you feed it.
At the beginning of our journey into ML and cybersecurity, we need to lay the foundations for a good development environment that fits our needs. This is the tale of three different environments: Google Colaboratory (Colab), a Jupyter server launched from your terminal, and VS Code.
Machine Learning (ML) and cybersecurity: is this a match made in heaven or just another tech hype? Either way, it is an interesting combination worth exploring for the security professional who wants to solve problems in a non-traditional manner, or for the data scientist who wants to work in a high-impact area.
The third part of the telemetry stack introduction covers the basic concepts of an alerting engine and how to implement them with Prometheus Alertmanager. You can read this post in Introduction to a Telemetry Stack - Part 3.
The third part of 'Intro to Pandas' discusses Exploratory Data Analysis (EDA) of black-box pcap data using the Pandas library. You can read this post in Intro to Pandas (Part 3) - Forecasting the network.
The second part of 'Intro to Pandas' discusses Exploratory Data Analysis (EDA) of black-box pcap data using the Pandas library. You can read this post in Intro to Pandas (Part 2) - Exploratory data analysis for network traffic.
In this post, I explain how you can use Jupyter notebooks with Poetry package management for Python. You can read this post in Jupyter Notebooks for Development.
An introduction to the Pandas library and how you can use it for network traffic data analysis. You can read this post on the NTC Blog: Introduction to Pandas for Network Development.