Releases: sherrisherry/cleandata

Thanksgiving

02 Dec 03:52

A collection of functions that work with data frames to inspect and manipulate data, and to keep track of data manipulation by producing log files.

Available on CRAN: https://cran.r-project.org/package=cleandata

Demonstration: Wrangling Ames Housing Dataset

New in V0.3.0

  • The 'log' parameter can now take its value from a 'log_arg' variable in the parent environment of a function (dynamic scoping)
    • Assigning a value to 'log' directly, as before, is still supported
    • 'log' is the parameter that controls whether log files are produced
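
The lookup described above can be sketched in base R. This is a minimal illustration, not cleandata's actual implementation: `demo_step` and `pipeline` are hypothetical names, and the values given to 'log' here are placeholder strings, since the format the package expects is not stated in these notes.

```r
# Hypothetical stand-in for a function with a 'log' parameter.
# Not part of cleandata; it only shows the lookup order.
demo_step <- function(log) {
  if (missing(log)) {
    # No explicit value: look for 'log_arg' starting in the environment
    # the function was called from. Fall back to FALSE (assumed here to
    # mean "no log") when no such variable is found.
    log <- get0("log_arg", envir = parent.frame(), inherits = TRUE,
                ifnotfound = FALSE)
  }
  log
}

# Hypothetical caller: sets 'log_arg' once for every step it runs.
pipeline <- function() {
  log_arg <- "pipeline.log"
  list(
    implicit = demo_step(),                 # picks up log_arg from pipeline()
    explicit = demo_step(log = "other.log") # the old way still wins
  )
}

res <- pipeline()
print(res)
```

Under these assumptions, setting 'log_arg' once in the calling function saves repeating the same 'log' argument in every call, while an explicit 'log' still overrides it for a single step.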

List of Functions

  • Inspection

    • inspect_map: Classify The Columns of A Data Frame
    • inspect_na: Find Out Which Columns Have Most NAs
    • inspect_smap: A Simplified, and Thus Faster, Version of inspect_map
  • Imputation

    • impute_mean: Impute Missing Values by Mean
    • impute_median: Impute Missing Values by Median
    • impute_mode: Impute Missing Values by Mode
  • Encoding

    • encode_binary: Encode Binary Data Into 0 and 1
    • encode_ordinal: Encode Ordinal Data Into Integers
    • encode_onehot: One-Hot Encoding
  • Partitioning

    • partition_random: Partition A Dataset Randomly
  • Other

    • wh_dict: Create Data Dictionary from Data Warehouse

Labor Day

05 Sep 15:13

A collection of functions that work with data frames to inspect, impute, encode, and partition data. The functions for imputation, encoding, and partitioning can produce log files to help you keep track of the data manipulation process.

Available on CRAN (submission is scheduled for Sep 11 due to a CRAN vacation)

Demonstration: Wrangling Ames Housing Dataset

I plan to keep writing new demos and linking them in this README file.

List of Functions

  • Inspection

    • inspect_map: Classify The Columns of A Data Frame
    • inspect_na: Find Out Which Columns Have Most NAs
    • inspect_smap: A Simplified, and Thus Faster, Version of inspect_map
  • Imputation

    • impute_mean: Impute Missing Values by Mean
    • impute_median: Impute Missing Values by Median
    • impute_mode: Impute Missing Values by Mode
  • Encoding

    • encode_binary: Encode Binary Data Into 0 and 1
    • encode_ordinal: Encode Ordinal Data Into Integers
    • encode_onehot: One-Hot Encoding
  • Partitioning

    • partition_random: Partition A Dataset Randomly

The 1st Version

30 Aug 01:39

A collection of functions that work with data frames to inspect, impute, and encode data. The functions for imputation and encoding can produce log files to help you keep track of the data manipulation process.

Available on CRAN: https://cran.r-project.org/package=cleandata

Demonstration: Wrangling Ames Housing Dataset

List of Functions

  • Inspection
    • inspect_map: Classify The Columns of A Data Frame
    • inspect_na: Find Out Which Columns Have Most NAs
  • Imputation
    • impute_mean: Impute Missing Values by Mean
    • impute_median: Impute Missing Values by Median
    • impute_mode: Impute Missing Values by Mode
  • Encoding
    • encode_binary: Encode Binary Data Into 0 and 1
    • encode_ordinal: Encode Ordinal Data Into Integers