[Error need help] Enrich data error about pymysql & enrich_onion():"alias" #3

Closed
ME-Msc opened this issue Sep 5, 2022 · 4 comments


ME-Msc commented Sep 5, 2022

I ran into two errors after modifying setup-gitee.cfg.
Before I modified the [phases] section, it collected raw data from Gitee correctly.
I have not been able to fix them. Could you please help me? Thanks.

Here is my setup-gitee.cfg:

[general]
short_name = GrimoireLab
update = true
min_update_delay = 86400
debug = true
logs_dir = /tmp/logs
menu_file = ./menu.yaml
aliases_file = ./aliases.json

[projects]
projects_file = ./projects-gitee-msc.json

[es_collection]
url = http://localhost:9200

[es_enrichment]
url = http://localhost:9200

[sortinghat]
host = 127.0.0.1
user = root
password = root
database = demo_sh
load_orgs = true
orgs_file = ./organizations.json
autoprofile = [gitee,git,github]
matching = [email]
sleep_for = 100
unaffiliated_group = Unknown
affiliate = true
strict_mapping = false
reset_on_load = false
identities_file = [./identities.yml]
identities_format = grimoirelab

[panels]
kibiter_default_index = git
kibiter_url = http://localhost:5601
community = true
github-repos = true

[phases]
collection = true
identities = true
enrichment = true
panels = false

[git]
raw_index = git_demo_raw
enriched_index = git_demo_enriched
latest-items = true
category = commit
# studies = [enrich_demography:git, enrich_areas_of_code:git, enrich_onion:git]
studies = [enrich_demography:git, enrich_onion:git]
from-date = 2022-01-01

[enrich_demography:git]

# [enrich_areas_of_code:git]
# in_index = git_demo_raw
# out_index = git-aoc_demo_enriched

[enrich_onion:git]
in_index = git
out_index = git-onion_demo_enriched
contribs_field = hash

[gitee]
raw_index = gitee_issues-raw
enriched_index = gitee_issues-enriched
category = issue
api-token = <MY-TOKEN>
sleep-for-rate = true
no-archive = true
studies = [enrich_onion:gitee-issue]
from-date = 2022-01-01

[enrich_onion:gitee-issue]
in_index = gitee_issues-enriched
out_index = gitee_issues_onion-enriched
data_source = gitee_issues

[gitee:pull]
raw_index = gitee_pulls-raw
enriched_index = gitee_pulls-enriched
category = pull_request
api-token = <MY-TOKEN>
sleep-for-rate = true
no-archive = true
studies = [enrich_onion:gitee-pull]
from-date = 2022-01-01

[enrich_onion:gitee-pull]
in_index = gitee_pulls-enriched
out_index = gitee_pulls_onion-enriched
data_source = gitee_pulls

[gitee:repo]
raw_index = gitee_repo-raw
enriched_index = gitee_repo-enriched
category = repository
no-archive = true
api-token = <MY-TOKEN>
sleep-for-rate = true
from-date = 2022-01-01

[github:issue]
raw_index = github_raw
enriched_index = github_enriched
api-token = 
category = issue
sleep-for-rate = true
no-archive = true
studies = [enrich_onion:github]
from-date = 2022-01-01

[enrich_onion:github]
in_index_iss = github_enriched
in_index_prs = github-pull_enriched
out_index_iss = github-issues-onion_enriched
out_index_prs = github-prs-onion_enriched

[github:pull]
raw_index = github-pull_raw
enriched_index = github-pull_enriched
api-token = 
category = pull_request
sleep-for-rate = true
no-archive = true
from-date = 2022-01-01

[github:repo]
raw_index = github-repo_raw
enriched_index = github-repo_enriched
api-token = 
category = repository
sleep-for-rate = true
no-archive = true
from-date = 2022-01-01

Errors:

  1. Error about pymysql. It appears only in the shell, not in the log file.
(ivenv) msc@Grimoirelab:~/giteeFiles/grimoirelab/default-grimoirelab-settings$ sirmordred -c setup-gitee.cfg 
Exception in thread gitee:pull:
Traceback (most recent call last):
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 1075, in read
Exception in thread git:
Traceback (most recent call last):
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 732, in _read_query_result
    first_packet = self.connection._read_packet()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/protocol.py", line 220, in check_error
Exception in thread gitee:repo:
Exception in thread Global tasks:
Traceback (most recent call last):
    result.read()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 1075, in read
    err.raise_mysql_exception(self._data)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
Traceback (most recent call last):
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    first_packet = self.connection._read_packet()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/connections.py", line 684, in _read_packet
    self.dialect.do_execute(
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    self.dialect.do_execute(
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/cursors.py", line 170, in execute
    packet.check_error()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/pymysql/protocol.py", line 220, in check_error
    raise errorclass(errno, errval)
pymysql.err.InternalError: (1050, "Table 'organizations' already exists")
  2. Error in the log file about enrich_onion():
2022-09-05 13:22:37,827 - sirmordred.task_enrich - DEBUG - [gitee] All studies: ['enrich_onion']
2022-09-05 13:22:37,827 - sirmordred.task_enrich - DEBUG - [gitee] Configured studies ['enrich_onion:gitee-issue']
2022-09-05 13:22:37,827 - sirmordred.task_enrich - INFO - [gitee] Executing studies ['enrich_onion:gitee-issue']
2022-09-05 13:22:37,847 - grimoire_elk.elk - INFO - [gitee] Starting study: enrich_onion:gitee-issue, params {'in_index': 'gitee_issues-enriched', 'out_index': 'gitee_issues_onion-enriched', 'data_source': 'gitee_issues'}
2022-09-05 13:22:37,847 - grimoire_elk.elk - ERROR - [gitee] Problem executing study enrich_onion:gitee-issue, enrich_onion() missing 1 required positional argument: 'alias'
2022-09-05 13:22:37,847 - sirmordred.task_manager - ERROR - [gitee] Exception in Task Manager enrich_onion() missing 1 required positional argument: 'alias'
Traceback (most recent call last):
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sirmordred/task_manager.py", line 99, in run
    task.execute()
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sirmordred/task_enrich.py", line 447, in execute
    raise e
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sirmordred/task_enrich.py", line 437, in execute
    self.__studies(retention_time)
  File "/home/msc/giteeFiles/ivenv/lib/python3.8/site-packages/sirmordred/task_enrich.py", line 350, in __studies
    do_studies(ocean_backend, enrich_backend, studies_args, retention_time=retention_time)
  File "/home/msc/giteeFiles/repos/grimoirelab-elk/grimoire_elk/elk.py", line 459, in do_studies
    raise e
  File "/home/msc/giteeFiles/repos/grimoirelab-elk/grimoire_elk/elk.py", line 456, in do_studies
    study(ocean_backend, enrich_backend, **params)
  File "/home/msc/giteeFiles/repos/grimoirelab-elk-gitee/grimoire_elk_gitee/enriched/gitee.py", line 504, in enrich_onion
    super().enrich_onion(enrich_backend=enrich_backend,
TypeError: enrich_onion() missing 1 required positional argument: 'alias'
2022-09-05 13:22:37,852 - sirmordred.sirmordred - ERROR - <class 'TypeError'>
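
The last frames of the traceback above show a subclass calling super().enrich_onion(...) without an `alias` argument. The following is only a minimal sketch of that failure mode, not the actual grimoire_elk code: the class names and the parameter lists are hypothetical stand-ins, and the idea that the override was written against an older base signature is an assumption.

```python
class BaseEnrich:
    """Hypothetical stand-in for a base enricher whose study requires `alias`."""

    def enrich_onion(self, enrich_backend, alias, in_index, out_index,
                     data_source=None):
        return {"alias": alias, "in_index": in_index,
                "out_index": out_index, "data_source": data_source}


class OldStyleEnrich(BaseEnrich):
    """Override that never forwards `alias` (assumed to predate it)."""

    def enrich_onion(self, ocean_backend, enrich_backend, in_index, out_index,
                     data_source=None):
        # `alias` is not passed, so the base method raises TypeError.
        return super().enrich_onion(enrich_backend=enrich_backend,
                                    in_index=in_index, out_index=out_index,
                                    data_source=data_source)


class FixedEnrich(BaseEnrich):
    """Override that accepts `alias` and forwards it to the base method."""

    def enrich_onion(self, ocean_backend, enrich_backend, alias, in_index,
                     out_index, data_source=None):
        return super().enrich_onion(enrich_backend=enrich_backend, alias=alias,
                                    in_index=in_index, out_index=out_index,
                                    data_source=data_source)


if __name__ == "__main__":
    try:
        OldStyleEnrich().enrich_onion(None, None,
                                      in_index="gitee_issues-enriched",
                                      out_index="gitee_issues_onion-enriched")
    except TypeError as exc:
        # The message names the missing 'alias' argument.
        print(exc)
```

If this reading is right, the error would come from the installed versions of the two packages not matching each other, rather than from the values in setup-gitee.cfg.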

Question:
I noticed that there is an in_index = git. I am confused about where the index "git" comes from. Is it enriched from the index "git_demo_raw"?

Thanks for your help!

@shanchenqi
Collaborator

Hi @ME-Msc, sorry for the late reply. Does Q1 still appear when you restart sirmordred? If so, could you please try the method here?
For Q2, the git index is an alias for 'git_demo_enriched'. We created it so that it can be used directly in GrimoireLab; see details here.
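
To illustrate what "alias" means here: an Elasticsearch alias is a second name that resolves to one or more real indices, so a study can read from `git` without knowing the concrete index name. The sketch below only builds the request body for the Elasticsearch `_aliases` API. The helper name is hypothetical, and the pairing of `git` with `git_demo_enriched` is taken from the config in this thread as an assumption, not from what SirMordred actually creates.

```python
import json


def build_alias_actions(index, aliases):
    """Build a body for POST /_aliases that adds each alias to `index`."""
    return {"actions": [{"add": {"index": index, "alias": alias}}
                        for alias in aliases]}


if __name__ == "__main__":
    # Assumed pairing: the enriched git index from setup-gitee.cfg and the
    # alias name used as in_index in [enrich_onion:git].
    body = build_alias_actions("git_demo_enriched", ["git"])
    print(json.dumps(body, indent=2))
```

A GET request to `/_alias/git` on the Elasticsearch instance (http://localhost:9200 in this setup) should list the indices that the alias currently points to, which is a quick way to check whether it exists at all.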

eyehwan added a commit that referenced this issue Sep 9, 2022
add Gitee2 to analyze issue and pr comments
@ME-Msc
Author

ME-Msc commented Sep 12, 2022

Hi @shanchenqi, thanks for your help! I have fixed Q1 with it.

What about the error "missing 1 required positional argument: 'alias'" mentioned above? (It is the one between what you called Q1 and Q2.)

For Q2, I had looked at the alias file before, but there is no 'git_demo_enriched' in it. Did you mean 'git_enrich' here?

@eyehwan
Member

eyehwan commented Sep 20, 2022

> For Q2, I have noticed the alias file before. But there is no 'git_demo_enriched' in it. Did you mean 'git_enrich' here?

Yes, correct.

@ME-Msc
Author

ME-Msc commented Sep 22, 2022

Thank you, @eyehwan. I realized that I can avoid the error by skipping the [enrich_onion:gitee_xxx] studies.
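
For anyone who wants to apply the same workaround, here is a rough sketch of what "skipping" could look like, using Python's standard configparser. The `skip_studies` helper is hypothetical, and it assumes that removing both the `studies` entry and the matching study section is enough. Note that configparser drops comments when it writes the file back.

```python
import configparser
import io

SAMPLE_CFG = """\
[gitee]
raw_index = gitee_issues-raw
enriched_index = gitee_issues-enriched
category = issue
studies = [enrich_onion:gitee-issue]

[enrich_onion:gitee-issue]
in_index = gitee_issues-enriched
out_index = gitee_issues_onion-enriched
data_source = gitee_issues
"""


def skip_studies(cfg_text, study_prefix="enrich_onion:gitee"):
    """Return cfg_text without studies whose name starts with study_prefix."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(cfg_text)
    for section in parser.sections():
        if section.startswith(study_prefix):
            # Drop the study's own section, e.g. [enrich_onion:gitee-issue].
            parser.remove_section(section)
            continue
        studies = parser.get(section, "studies", fallback=None)
        if studies is None:
            continue
        names = [s.strip() for s in studies.strip("[]").split(",") if s.strip()]
        kept = [s for s in names if not s.startswith(study_prefix)]
        if kept:
            parser.set(section, "studies", "[" + ", ".join(kept) + "]")
        else:
            parser.remove_option(section, "studies")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(skip_studies(SAMPLE_CFG))
```

Studies on other data sources, such as enrich_onion:git, are left untouched because their names do not start with the prefix.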

@ME-Msc ME-Msc closed this as completed Sep 22, 2022