Data Preprocessing (Datenvorverarbeitung)

Concept

Preparing and cleaning raw data so that it can be used to train and evaluate machine learning models

Data flow in data preprocessing

flowchart TD
    A[Raw data] --> B[Data cleaning]
    B --> C[Data transformation]
    C --> D[Feature engineering]
    D --> E[Data splitting]
    E --> F[Training data]
    E --> G[Validation data]
    E --> H[Test data]
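
The stages in the flowchart can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (clean, min_max_scale, add_product_feature, split), the toy data, the choice of min-max scaling, and the 60/20/20 split are all hypothetical and not taken from any particular library or from the source.

```python
import random

def clean(rows):
    """Data cleaning: drop rows that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r)]

def min_max_scale(rows):
    """Data transformation: rescale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(r, lows, highs)]
        for r in rows
    ]

def add_product_feature(rows):
    """Feature engineering: append the product of the first two columns."""
    return [list(r) + [r[0] * r[1]] for r in rows]

def split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Data splitting: shuffle with a fixed seed, then cut into three parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = int(len(rows) * val_frac)
    n_test = int(len(rows) * test_frac)
    n_train = len(rows) - n_val - n_test
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

# Hypothetical toy data: ten complete rows plus one row with a missing value.
raw = [[float(i), float(2 * i + 1)] for i in range(10)] + [[None, 3.0]]

train, val, test = split(add_product_feature(min_max_scale(clean(raw))))
```

The sketch follows the order shown in the flowchart. In practice, scaling statistics are often computed on the training portion only and then applied to the validation and test data, so that no information from those sets leaks into training.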

In context

  • Typically used together with data exploration and model validation
  • Related to: Feature Selection, Data Augmentation, Data Quality
  • Example use: Normalization of pixel values in image recognition tasks
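
The pixel normalization example can be illustrated as follows. This is a minimal sketch that assumes an 8-bit grayscale image stored as nested lists with intensities from 0 to 255. The function name normalize_pixels and the sample values are hypothetical.

```python
def normalize_pixels(image, max_value=255):
    """Scale integer pixel intensities from [0, max_value] to [0.0, 1.0]."""
    return [[p / max_value for p in row] for row in image]

# Hypothetical 2x3 grayscale image with 8-bit intensities.
image = [[0, 51, 255], [102, 204, 153]]
normalized = normalize_pixels(image)
```

Dividing by the maximum possible intensity keeps all inputs on the same scale, which is a common preparation step before feeding images to a model.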
Source: AI Generated