Error when attempting to access private repo #30

Open
jfredrickson5 opened this issue Apr 23, 2019 · 8 comments

Comments

@jfredrickson5
Collaborator

Attempting to run scraper on a GitHub org with private repos results in an error.

Output:

% scraper --config config.json                     
2019-04-23 17:29:12,536 - INFO: Connected to: https://github.com                                     
2019-04-23 17:29:12,773 - INFO: Processing: GSA/private-test                                         
Traceback (most recent call last):
  File "/home/jf/.pyenv/versions/3.7.0/bin/scraper", line 11, in <module>                            
    load_entry_point('llnl-scraper', 'console_scripts', 'scraper')()                                 
  File "/home/jf/gsa/scraper/scraper/gen_code_gov_json.py", line 76, in main                         
    code_json = code_gov.process_config(config_json)                                                 
  File "/home/jf/gsa/scraper/scraper/code_gov/__init__.py", line 58, in process_config               
    code_gov_project = Project.from_github3(repo, labor_hours=compute_labor_hours)                   
  File "/home/jf/gsa/scraper/scraper/code_gov/models.py", line 217, in from_github3                  
    elif date_parse(repository.created_at) < POLICY_START_DATE:                                      
  File "/home/jf/.pyenv/versions/3.7.0/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1356, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/home/jf/.pyenv/versions/3.7.0/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 645, in parse
    res, skipped_tokens = self._parse(timestr, **kwargs)                                             
  File "/home/jf/.pyenv/versions/3.7.0/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 721, in _parse
    l = _timelex.split(timestr)         # Splits the timestr into tokens                             
  File "/home/jf/.pyenv/versions/3.7.0/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 207, in split
    return list(cls(s))
  File "/home/jf/.pyenv/versions/3.7.0/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 76, in __init__
    '{itype}'.format(itype=instream.__class__.__name__))                                             
TypeError: Parser must be a string or character stream, not datetime

Here is a simplified config.json as a test case. The GSA/private-test repo is private and contains a README.md file.

{
  "agency": "GSA",
  "contact_email": "[email protected]",
  "GitHub": [
    {
      "public_only": false,
      "repos": [
        "GSA/private-test"
      ]
    }
  ]
}

Example of a real config.json where we encountered the issue. It scans properly until it arrives at a private repo, at which point it crashes.

{
  "agency": "GSA",
  "contact_email": "[email protected]",
  "GitHub": [
    {
      "public_only": false,
      "orgs": [
        "GSA",
        "18F",
        "presidential-innovation-fellows",
        "USWDS"
      ]
    }
  ]
}

Verified that my GitHub access token is valid and can view private repos by using the same token for a different script.

@IanLee1521
Member

Interesting... Can you post the output of pip list? Specifically, I'm looking for the version of github3.py.

@jfredrickson5
Collaborator Author

Here's pip list:

Package           Version    Location                                                                 
----------------- ---------- --------------------                                                     
asn1crypto        0.24.0                                                                              
certifi           2018.8.24                                                                           
cffi              1.12.3                                                                              
chardet           3.0.4                                                                               
cryptography      2.6.1                                                                               
decorator         4.3.0                                                                               
github3.py        1.2.0                                                                               
idna              2.7                                                                                 
isodate           0.6.0                                                                               
jwcrypto          0.6.0                                                                               
llnl-scraper      0.8.0.dev0 /home/jf/src/scraper                                                     
mock              2.0.0                                                                               
msrest            0.6.6                                                                               
oauthlib          3.0.1                                                                               
pbr               4.2.0                                                                               
pip               19.0.3                                                                              
pycparser         2.19                                                                                
python-dateutil   2.7.3                                                                               
python-gitlab     1.6.0                                                                               
requests          2.19.1                                                                              
requests-oauthlib 1.2.0                                                                               
setuptools        39.0.1                                                                              
six               1.11.0                                                                              
stashy            0.5
uritemplate       3.0.0
uritemplate.py    3.0.2
urllib3           1.23
virtualenv        16.1.0
vsts              0.1.25

@jfredrickson5
Collaborator Author

Huh, now I'm super confused. I nuked my pyenv and started fresh. Now the repository.created_at property is a string and PR #32 no longer works for me.

I had this debugging output when I was working on the change: print("repository.created_at type: ", type(repository.created_at))

It previously output datetime and now it's str.

Here's my latest pip list:

Package           Version    Location
----------------- ---------- ---------------------
asn1crypto        0.24.0
astroid           2.2.5
certifi           2019.3.9
cffi              1.12.3
chardet           3.0.4
cryptography      2.6.1
decorator         4.4.0
github3.py        1.2.0
idna              2.8
isodate           0.6.0
isort             4.3.17
jwcrypto          0.6.0
lazy-object-proxy 1.3.1
llnl-scraper      0.8.0.dev0 /Users/jf/gsa/scraper
mccabe            0.6.1
mock              2.0.0
msrest            0.6.6
oauthlib          3.0.1
pbr               5.2.0
pip               19.1
pycparser         2.19
pylint            2.3.1
python-dateutil   2.8.0
python-gitlab     1.8.0
requests          2.21.0
requests-oauthlib 1.2.0
setuptools        40.8.0
six               1.12.0
stashy            0.6
typed-ast         1.3.5
uritemplate       3.0.0
urllib3           1.24.2
vsts              0.1.25
wrapt             1.11.1

Possibly user error due to a bad environment? No idea. I'm going to see if I can replicate it and if not, maybe we can close this.
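
To narrow down which package changed between two environments, the two pip list outputs can be compared mechanically. A minimal sketch, assuming the column layout shown above; the helper names parse_pip_list and diff_versions are made up for this example and are not part of scraper, and the sample strings are shortened excerpts of the two listings:

```python
def parse_pip_list(text):
    """Map package name -> version from `pip list` column output."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        # Any third column (an editable install location) is ignored.
        versions[parts[0]] = parts[1]
    return versions


def diff_versions(old, new):
    """Return {name: (old_version, new_version)} for packages that differ."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Shortened excerpts of the two listings, for illustration only.
old_env = """Package           Version
----------------- ----------
github3.py        1.2.0
python-dateutil   2.7.3
virtualenv        16.1.0
"""
new_env = """Package           Version
----------------- ----------
github3.py        1.2.0
pylint            2.3.1
python-dateutil   2.8.0
"""

for name, (before, after) in diff_versions(
    parse_pip_list(old_env), parse_pip_list(new_env)
).items():
    print(name, before, "->", after)
```

A None on either side means the package exists in only one of the two environments.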

@IanLee1521
Member

Thanks for the additional information, @jfredrickson5.

FWIW, you're not crazy... I've seen very similar behavior. I think there is a package in the dependency chain that is changing its behavior... I've thought about trying to add some exception handling there to "do the right thing" but haven't gotten all the way there yet. If you're interested in adding that, I'd welcome the addition!
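
One way that "do the right thing" handling might look is a small helper that accepts either type. This is only a sketch under assumptions, not the project's code: to_datetime and EXAMPLE_CUTOFF are hypothetical names, it uses a type check rather than exception handling, and it swaps in the standard library's datetime.fromisoformat for dateutil so the snippet has no third-party dependencies.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for POLICY_START_DATE; the real value is not shown in this thread.
EXAMPLE_CUTOFF = datetime(2016, 8, 8, tzinfo=timezone.utc)


def to_datetime(value):
    """Return a datetime whether value is already one or an ISO 8601 string."""
    if isinstance(value, datetime):
        # Already parsed upstream; handing this to a string parser is
        # what raised the TypeError in the traceback above.
        return value
    if isinstance(value, str):
        # Timestamps such as "2019-04-23T17:29:12Z" end in "Z", which
        # datetime.fromisoformat() only accepts from Python 3.11 on.
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        return datetime.fromisoformat(value)
    raise TypeError("expected str or datetime, got " + type(value).__name__)


# Both forms of created_at seen in this issue yield the same value.
as_string = to_datetime("2019-04-23T17:29:12Z")
as_object = to_datetime(datetime(2019, 4, 23, 17, 29, 12, tzinfo=timezone.utc))
print(as_string == as_object)
print(as_string < EXAMPLE_CUTOFF)
```

With something like this, the comparison in from_github3 would no longer depend on which type the dependency chain happens to return, as long as both sides of the comparison are timezone-aware.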

@IanLee1521
Member

@jfredrickson5 - I see you closed the MR. Are you thinking that this is resolved too, or do we still need to fix something?

@jfredrickson5
Collaborator Author

@IanLee1521 I'm not sure what notification GitHub sent you, but I'm in the process of merging my separate personal and work GitHub accounts into one, so I think that must have unintentionally triggered something; I haven't actually made changes to this issue.

@IanLee1521
Member

It was the MR that got closed: #32

but ah, I see it was auto-closed by deleting a reference:

[screenshot]

@jfredrickson5
Collaborator Author

Ah, that was my personal fork that disappeared then. It's been a while so I don't know if the change is still valid, but feel free to grab the change and use it.
