Datasets #61

Merged
merged 4 commits into from
Dec 22, 2024
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: nonprobsvy
Type: Package
Title: Inference Based on Non-Probability Samples
-Version: 0.1.1
+Version: 0.1.2
Authors@R:
c(person(given = "Łukasz",
family = "Chrostowski",
@@ -22,7 +22,7 @@ Encoding: UTF-8
LazyData: true
RdMacros: mathjaxr
Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.2.3
+RoxygenNote: 7.3.2
URL: https://github.com/ncn-foreigners/nonprobsvy, https://ncn-foreigners.github.io/nonprobsvy/
BugReports: https://github.com/ncn-foreigners/nonprobsvy/issues
Suggests:
9 changes: 8 additions & 1 deletion NEWS.md
@@ -1,9 +1,16 @@
# nonprobsvy 0.1.2

------------------------------------------------------------------------

### Features
- two additional datasets have been included: `jvs` (Job Vacancy Survey; a probability sample survey) and `admin` (Central Job Offers Database; a non-probability sample survey). The units and auxiliary variables have been aligned in a way that allows the data to be integrated using the methods implemented in this package.

# nonprobsvy 0.1.1

------------------------------------------------------------------------

### Bugfixes
-- bug Fix occuring when estimation was based on auxiliary variable, which led to compression of the data from the frame to the vector.
+- bug Fix occurring when estimation was based on auxiliary variable, which led to compression of the data from the frame to the vector.
- bug Fix related to not passing `maxit` argument from `controlSel` function to internally used `nleqslv` function
- bug Fix related to storing `vector` in `model_frame` when predicting `y_hat` in mass imputation `glm` model when X is based on one auxiliary variable only - fix provided by converting it to a `data.frame` object.

58 changes: 58 additions & 0 deletions R/data.R
@@ -0,0 +1,58 @@
#' Job Vacancy Survey
#'
#' @description
#' This is a subset of the Job Vacancy Survey from Poland (for one quarter).
#' The data has been subject to slight manipulation, but the relationships in the data have been preserved.
#' For further details on the JVS, please refer to the following link:
#' \url{https://stat.gov.pl/obszary-tematyczne/rynek-pracy/popyt-na-prace/zeszyt-metodologiczny-popyt-na-prace,3,1.html}.
#'
#'
#' @format A single data.frame with 6,523 rows and 6 columns
#'
#' \describe{
#' \item{\code{id}}{Identifier of an entity (company: legal or local).}
#' \item{\code{private}}{Whether the company is a private (1) or public (0) entity.}
#' \item{\code{size}}{The size of the entity: S -- small (up to 9 employees), M -- medium (10-49) or L -- large (over 49).}
#' \item{\code{nace}}{The main NACE code for a given entity: C, D.E, F, G, H, I, J, K.L, M, N, O, P, Q or R.S (14 levels, 3 combined: D and E, K and L, and R and S).}
#' \item{\code{region}}{The region of Poland (16 levels: 02, 04, ..., 32).}
#' \item{\code{weight}}{The final (calibrated) weight (w-weight). We do not have access to design weights (d-weights).}
#' }
#'
#' @docType data
#' @keywords datasets
#' @name jvs
#' @rdname jvs
#' @examples
#'
#' data("jvs")
#' head(jvs)
#'
"jvs"

#' Admin data (non-probability survey)
#'
#' @description
#' This is a subset of the Central Job Offers Database (CBOP), a voluntary administrative data set (non-probability sample).
#' The data was slightly manipulated in a way that preserves the relationships in the data, and then aligned with the JVS units and auxiliary variables.
#' For more information about the CBOP, please refer to: \url{https://oferty.praca.gov.pl/}.
#'
#' @format A single data.frame with 9,344 rows and 6 columns
#'
#' \describe{
#' \item{\code{id}}{Identifier of an entity (company: legal or local).}
#' \item{\code{private}}{Whether the company is a private (1) or public (0) entity.}
#' \item{\code{size}}{The size of the entity: S -- small (up to 9 employees), M -- medium (10-49) or L -- large (over 49).}
#' \item{\code{nace}}{The main NACE code for a given entity: C, D.E, F, G, H, I, J, K.L, M, N, O, P, Q or R.S (14 levels, 3 combined: D and E, K and L, and R and S).}
#' \item{\code{region}}{The region of Poland (16 levels: 02, 04, ..., 32).}
#' \item{\code{single_shift}}{Whether an entity seeks employees on a single shift.}
#' }
#'
#' @docType data
#' @keywords datasets
#' @name admin
#' @rdname admin
#' @examples
#'
#' data("admin")
#' head(admin)
"admin"
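
The two roxygen blocks above document the same four auxiliary variables (`private`, `size`, `nace`, `region`) for both datasets. A minimal sketch of how that alignment could be checked is shown below. It runs on small invented data frames (`jvs_toy`, `admin_toy`) rather than the shipped `.rda` files, so every value is hypothetical and only the column names follow the documentation; the logical coding of `single_shift` is also an assumption.

```r
# Toy stand-ins for `jvs` and `admin`: all values are invented,
# only the column names follow the roxygen documentation.
jvs_toy <- data.frame(
  id      = c("j1", "j2", "j3", "j4"),
  private = c(0, 1, 1, 0),
  size    = c("S", "M", "L", "M"),
  nace    = c("C", "F", "G", "C"),
  region  = c("02", "04", "30", "14"),
  weight  = c(10, 20, 5, 15)
)
admin_toy <- data.frame(
  id           = c("a1", "a2", "a3"),
  private      = c(1, 1, 0),
  size         = c("M", "L", "M"),
  nace         = c("F", "G", "C"),
  region       = c("04", "30", "14"),
  single_shift = c(TRUE, FALSE, TRUE)  # assumed logical; the docs do not state the type
)

aux <- c("private", "size", "nace", "region")

# Both samples must carry every auxiliary variable.
stopifnot(all(aux %in% names(jvs_toy)), all(aux %in% names(admin_toy)))

# For each auxiliary variable, collect the values seen in the
# non-probability sample that never occur in the probability sample.
unmatched <- lapply(aux, function(v) {
  setdiff(unique(admin_toy[[v]]), unique(jvs_toy[[v]]))
})
names(unmatched) <- aux

# TRUE when no auxiliary variable has a value missing from the probability sample.
aligned <- all(lengths(unmatched) == 0)
print(aligned)
```

With the real data, replacing the toy frames with `jvs` and `admin` after `data("jvs")` and `data("admin")` should run the same check, assuming the columns are stored as documented.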
Binary file added data/admin.rda
Binary file not shown.
Binary file added data/jvs.rda
Binary file not shown.
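
The NEWS entry says the two datasets were aligned so that they can be integrated. As a rough illustration of what integration means here, the sketch below uses plain post-stratification on `size`: stratum totals come from the probability sample's `weight` column and stratum means of `single_shift` come from the non-probability sample. This is a deliberately simple stand-in written in base R, not one of the estimators implemented in the package. The data frames are hypothetical toy data, and the 0/1 coding of `single_shift` is an assumption.

```r
# Hypothetical toy data; column names follow the documentation of `jvs` and `admin`.
jvs_toy <- data.frame(
  size   = c("S", "S", "M", "M", "L"),
  weight = c(30, 30, 15, 15, 10)
)
admin_toy <- data.frame(
  size         = c("S", "M", "M", "L", "L"),
  single_shift = c(1, 1, 0, 0, 1)  # assumed 0/1 coding
)

# Estimated population size of each stratum, from the calibrated weights.
N_hat <- tapply(jvs_toy$weight, jvs_toy$size, sum)

# Stratum means of the target variable in the non-probability sample.
y_bar <- tapply(admin_toy$single_shift, admin_toy$size, mean)

# Every population stratum needs at least one non-probability observation.
strata <- names(N_hat)
stopifnot(all(strata %in% names(y_bar)))

# Post-stratified mean: stratum means weighted by estimated stratum sizes.
ps_mean <- sum(N_hat[strata] * y_bar[strata]) / sum(N_hat)

# Unadjusted mean of the non-probability sample, for comparison.
naive_mean <- mean(admin_toy$single_shift)

print(c(naive = naive_mean, post_stratified = ps_mean))
```

In the toy data small entities are under-represented in the non-probability sample, so the adjusted mean moves toward their stratum mean. Real analyses would normally use more auxiliary variables and a proper variance estimator.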
32 changes: 32 additions & 0 deletions man/admin.Rd

34 changes: 34 additions & 0 deletions man/jvs.Rd

1 change: 1 addition & 0 deletions nonprobsvy.Rproj
@@ -1,4 +1,5 @@
Version: 1.0
ProjectId: 320f8c93-491e-4ea6-a604-c30218d09bf4

RestoreWorkspace: Default
SaveWorkspace: Default