Date of sale

Published

October 3, 2022

When Kylie started working for Nike, she didn’t believe her first project was at the core of their sales process.

She found a team working on a machine learning model for about a year with mediocre results. After a couple of weeks, Kylie proposed to do some feature engineering around a feature representing the sale date to help the model improve its predictions.

Which of the following are examples of feature engineering techniques that Kylie could do to improve the model?

  1. Replacing the feature representing the date of the sale with three separate columns for the year, month, and day.

  2. Replacing the feature representing the date of the sale with a single value containing the number of seconds since the year started.

  3. Replacing the feature representing the date of the sale with a single value that contains the number of the month.

  4. Keeping the date feature untouched and adding a new column representing the number of samples sold during the same month.

1, 2, 3, 4

We don’t know which of these options will be the most useful, but every one of them is an example of feature engineering that could help the model make better predictions.

The first three options are examples of how we can derive new features from a column in our dataset. In this case, Kylie can turn a date field into three components or simplify it by replacing it with a single value that keeps the necessary information for the model. Notice that, in these three cases, Kylie is not introducing anything new. Instead, she is transforming the data so the model can use it.

The fourth option is an example of frequency encoding, where Kylie counts how many products have been sold every month and creates a meta-feature with this information.

Recommended reading