Archive for April, 2018

Ihab Ilyas

April 18, 2018

Data cleaning is a machine learning problem that needs data systems help!

When dealing with real-world data, dirty data is the norm rather than the exception. We continuously need to predict correct values, impute missing ones, and find links between various data artefacts such as schemas and records. We need to stop treating data cleaning as a piecemeal exercise (resolving different types of errors in isolation), and […]
