Remove docker and switch to sqlite
msj committed Jun 4, 2024
1 parent 822d10f commit 7ad3110
Showing 248 changed files with 1,060,802 additions and 138 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -1,7 +1,11 @@
# Database
*.db

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
.DS_STORE

# C extensions
*.so
@@ -109,4 +113,4 @@ irsdb/metadata/migrations/
irsdb/return/migrations/


nohup.out
nohup.out
101 changes: 101 additions & 0 deletions 990-xml-reader/.gitignore
@@ -0,0 +1,101 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
7 changes: 7 additions & 0 deletions 990-xml-reader/.gitmodules
@@ -0,0 +1,7 @@
[submodule "irs_reader/metadata"]
path = irs_reader/metadata
url = ../990-xml-metadata.git
[submodule "metadata"]
path = metadata
url = ../990-xml-metadata.git

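Both submodule entries above point at the same relative URL but check it out at two different paths. A minimal sketch of reading that mapping, assuming the file content is exactly as shown in this diff; the `submodule_paths` helper is hypothetical and not part of the repository:

```python
import configparser

# Content copied from the .gitmodules file added in this commit.
GITMODULES = """\
[submodule "irs_reader/metadata"]
path = irs_reader/metadata
url = ../990-xml-metadata.git
[submodule "metadata"]
path = metadata
url = ../990-xml-metadata.git
"""


def submodule_paths(text):
    """Return a {checkout path: url} mapping parsed from .gitmodules text.

    Hypothetical helper for illustration; .gitmodules uses an INI-like
    syntax that configparser can read for simple files like this one.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {parser[name]["path"]: parser[name]["url"] for name in parser.sections()}


mapping = submodule_paths(GITMODULES)
for path, url in mapping.items():
    print(f"{path} <- {url}")
```

Because the URL is relative, git resolves it against the superproject's own remote, so the metadata repository is expected to sit next to wherever this repository is hosted.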
99 changes: 99 additions & 0 deletions 990-xml-reader/CHANGELOG.md
@@ -0,0 +1,99 @@
# IRSX Change Log

All notable changes are documented in this file.

## 0.3.2 - 2022-07-19

Allow version 2020v1.1

## 0.3.1 - 2022-07-19

Removed IRSX's ability to retrieve filings from an S3 bucket.
Add irsx_retrieve to retrieve an entire year of filings from the IRS's new location. This is experimental; it's not clear
how the IRS will add additional zip files, or why the number per year is so inconsistent. Also modify the irsx index command to
retrieve index files from the new locations.





## 0.2.13 - 2022-05-31

Allow version 2020v4.2 to run. Xpaths newly added in 2020 (very few) are still unsupported, but everything else appears to work.


## 0.2.13 - 2021-07-21

Add experimental support for schemas for TY 2020. IRSx will process these now; gathering info about missing xpaths.

## 0.2.12 - 2020-10-23

Bugfix.

## 0.2.11 - 2020-10-23

Bugfix.


## 0.2.10 - 2020-10-23

Add support for 2018 schemas through 2018v3.2 and 2018v3.3. Add experimental support for 2019v5.0 through 2019v5.3.


## 0.2.9 - 2019-10-09

Add support for 2018 schemas through 2018v3.1

## 0.2.8 - 2019-09-15

Update metadata submodule

## 0.2.7 - 2018-08-22

Point to updated 990-xml-metadata repo, which includes xpaths for Tax Years 2017 and 2018.

## 0.2.6 - 2018-08-22

Allow version 2018v3.0 filings to be processed; these are also in production but were omitted in 0.2.4.


## 0.2.5 - 2018-08-20

Allow xml namespacing, as exhibited by 201940149349301304_public.
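
One common way to tolerate namespaced XML is to strip the namespace from every tag before looking elements up by bare name. The sketch below illustrates that idea with the standard library only; it is not necessarily how IRSx handles it, and the sample document and namespace URI are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the namespace URI is a placeholder.
SAMPLE = (
    '<Return xmlns="http://example.com/ns" returnVersion="2018v3.1">'
    "<ReturnHeader><TaxYr>2018</TaxYr></ReturnHeader>"
    "</Return>"
)


def strip_namespaces(root):
    """Rewrite '{uri}Tag' element names to plain 'Tag', in place."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(SAMPLE))
# After stripping, bare-name paths work regardless of the declared namespace.
tax_year = root.find("ReturnHeader/TaxYr").text
print(root.tag, tax_year)
```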

## 0.2.4 - 2018-08-20

Allow version 2018v3.1 filings to be processed. A few new metadata lines will need to be added once I've processed the .xsd files.


## 0.2.3

PR to accept file cache location as env var - https://github.com/jsfenfen/990-xml-reader/pull/20
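
Reading a cache location from the environment usually looks like the sketch below. The variable name `EXAMPLE_IRSX_CACHE_DIR`, the default directory name, and the `cache_directory` helper are all placeholders chosen for illustration; the linked PR defines the real name and behavior.

```python
import os
from pathlib import Path


def cache_directory(env=os.environ, default="XML"):
    """Return the cache directory, preferring an environment override.

    EXAMPLE_IRSX_CACHE_DIR and the default value are hypothetical names,
    not taken from the IRSx source.
    """
    return Path(env.get("EXAMPLE_IRSX_CACHE_DIR", default)).expanduser()


print(cache_directory({"EXAMPLE_IRSX_CACHE_DIR": "/tmp/filings"}))
print(cache_directory({}))
```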

## 0.2.2 - 2018-05-09

- Incorporate metadata changes; still need better approach

## 0.2.1 - 2018-05-04

- Updates in metadata to cover 2017; updates downstream.

## 0.2.0 - 2018-05-03

- Depend on metadata as a submodule instead of as a directory.
- Change in metadata: instead of a semicolon-delimited list of versions, files now include version\_start and version\_end. The version\_start holds the __year__ that a variable first appeared; the version\_end is left blank unless the variable is no longer used.
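
A minimal sketch of how a consumer might test whether a metadata row applies to a given year under the version\_start / version\_end scheme. It assumes both columns hold plain years and that version\_end, when present, is inclusive; the changelog does not state the inclusivity, and the `applies_to_year` helper is hypothetical.

```python
def applies_to_year(row, year):
    """Return True if a metadata row covers the given tax year.

    Assumptions (not confirmed by the changelog): both columns are plain
    year strings, and version_end is inclusive when it is filled in.
    """
    start = int(row["version_start"])
    end_raw = (row.get("version_end") or "").strip()
    if year < start:
        return False
    # A blank version_end means the variable is still in use.
    return not end_raw or year <= int(end_raw)


still_used = {"version_start": "2013", "version_end": ""}
retired = {"version_start": "2013", "version_end": "2015"}

print(applies_to_year(still_used, 2016))
print(applies_to_year(retired, 2016))
```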

## 0.1.1 - 2018-03-19

- Added the object_id to the .csv output
- Added IRSX\_SETTINGS\_LOCATION to allow: `from irs_reader.settings import IRSX_SETTINGS_LOCATION`

## 0.0.10 - 2018-03-15

- Initial release



## details

The format is based on [Keep a Changelog](http://keepachangelog.com/); with a goal of [semantic versioning](http://semver.org/).
21 changes: 21 additions & 0 deletions 990-xml-reader/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2017 Jacob Fenton

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
2 changes: 2 additions & 0 deletions 990-xml-reader/MANIFEST.in
@@ -0,0 +1,2 @@
include README.md
recursive-include irs_reader/metadata *
14 changes: 14 additions & 0 deletions 990-xml-reader/Pipfile
@@ -0,0 +1,14 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
xmltodict = "*"
requests = "*"
unicodecsv = "*"

[dev-packages]

[requires]
python_version = "3.10"
76 changes: 76 additions & 0 deletions 990-xml-reader/Pipfile.lock
