
Tags generated, count 0 #2

Open
andy5995 opened this issue Oct 30, 2017 · 5 comments

Comments

@andy5995

Tags generated, count 0

It seems to be breaking before reading the tags.

After adding a little debug code

$ python3 tag_generator.py 
_posts/2017-10-29-robert_koch_institut.md
_posts/2017-10-29-कोगनीटिव_बिहेवियर_थरेपी.md
_posts/2017-10-27-a-canvas-of-the-minds.md
_posts/2017-10-28-side_effects_book_alison_bass.md
Tags generated, count 0

It never gets to here:

for tag in total_tags:
    tag_filename = tag_dir + tag + '.md'
    f = open(tag_filename, 'a')
    print(tag_filename)  # debug line I added; it is never printed

Maybe the format of my post file?

tags:
    - wordpress
    - personal_stories
    - collaborative
    - blogs
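
For reference, a block-style list like the one above could be read with something along the lines of the sketch below. It assumes the front matter is delimited by `---` lines; the function name `extract_block_tags` is hypothetical, and this is not the parsing code in tag_generator.py.

```python
def extract_block_tags(text):
    """Collect tags written as a YAML block list under a `tags:` key."""
    tags = []
    in_front_matter = False
    in_tags = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == '---':
            if in_front_matter:
                break  # closing delimiter: stop reading
            in_front_matter = True
            continue
        if not in_front_matter:
            continue
        if line == 'tags:':
            in_tags = True
        elif in_tags and line.startswith('- '):
            tags.append(line[2:].strip())
        elif line:
            in_tags = False  # any other key ends the tag list
    return tags


sample = """---
title: Example
tags:
    - wordpress
    - personal_stories
---
Body text
"""
print(extract_block_tags(sample))
```

This is a hand-rolled reader for one layout only; a real YAML parser would be the safer choice for anything more involved.
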
@qian256
Owner

qian256 commented Apr 17, 2018

Are you still having this problem? It may have happened because no folder called tag existed when the script was run. I fixed that in the new script.
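
A minimal sketch of guarding against a missing output folder, assuming the folder name `tag` from this thread. The helper `ensure_tag_dir` is hypothetical and is not necessarily how the new script does it.

```python
import os
import tempfile


def ensure_tag_dir(tag_dir):
    """Create tag_dir if it is missing; do nothing if it already exists."""
    os.makedirs(tag_dir, exist_ok=True)
    return os.path.isdir(tag_dir)


# Demonstrated inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    tag_dir = os.path.join(root, 'tag')
    print(ensure_tag_dir(tag_dir))
    print(ensure_tag_dir(tag_dir))  # second call is a no-op
```
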

@linotes

linotes commented Aug 2, 2018

After I moved the files in _post into newly created subdirectories, tags could no longer be generated. Could you please update the script? Thanks!

@leucotic

leucotic commented Nov 7, 2018

Hi, I am using Jekyll to build my site and I have a similar problem. In fact, after I ran the script, it deleted my existing manually created tagname.md pages. I think the issue may be that it only looks through

post_dir = '_posts/'

However, I want it to generate tags from other pages in other locations. I don't know whether I could do that with site.pages or something similar; I do not know much about Python. There may also be a separate issue: when I tried moving some of the posts/pages I want tags generated from into the _posts/ directory, I got this error:

 File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 1882: ordinal not in range(128)

I don't think I have anything particularly unusual in my YAML, only numbers, letters, :, ], /, and -.

Or does it go through the entire post? I'm not sure how to deal with this, or whether my analysis is completely wrong. Any advice would be appreciated!
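
One likely explanation for the traceback above: the file was opened without an explicit encoding, so Python fell back to the platform default (ASCII here). Text-mode reads decode the file in chunks, so a non-ASCII byte in the post body can raise the error even when the front matter is plain ASCII. A minimal sketch of reading posts as UTF-8, assuming the files really are UTF-8 encoded (the helper `read_post` is hypothetical):

```python
import os
import tempfile


def read_post(path):
    """Read a post as UTF-8 regardless of the platform default encoding."""
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()


with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, 'post.md')
    # ASCII-only front matter, non-ASCII text in the body
    with open(path, 'w', encoding='utf-8') as f:
        f.write('---\ntags: [blogs]\n---\n我的文章\n')
    text = read_post(path)
    print(text.splitlines()[1])

# Decoding the same bytes as ASCII fails with the error from the traceback.
try:
    '我'.encode('utf-8').decode('ascii')
except UnicodeDecodeError as err:
    print('ascii decode fails:', err.reason)
```
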

@AleksandrHovhannisyan

For me, this was happening because my _pages/ directory organizes blog posts into subfolders based on the primary category they fall into:

image

The script as written only works if all of your blog posts sit directly in the _pages directory. If you also want to traverse all nested subdirectories and process the posts in them, use this:

for dir_name, subdir_list, file_list in os.walk(post_dir):
    for file in file_list:
        f = open(os.path.join(dir_name, file), 'r', encoding='utf-8')
        crawl = False
        # rest of the script

@5nizza

5nizza commented Oct 30, 2020

To add to @AleksandrHovhannisyan's comment, here is code that supports subdirectories as well as tags given as an inline list after tags: (so you can write things like tags: [one, two, 'first tag', 'second tag']):

import glob
import os

post_dir = '_posts/'
tag_dir = 'tag/'

file_names = glob.glob(post_dir + '**/*.md', recursive=True)

tags = set()
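for file in file_names:
    # Read as UTF-8 so non-ASCII post content does not break decoding
    with open(file, 'r', encoding='utf-8') as f:
        inside_header = False
        for line in f:
            line = line.strip()
            if line == '---':
                if inside_header:
                    break  # end of front matter: continue to the next file
                inside_header = True
            if line.startswith('tags:'):
                tags_token = line[5:].strip()
                if tags_token.startswith('['):
                    # Inline list, e.g. tags: [one, two, 'first tag']
                    tags_token = tags_token.strip('[]')
                    new_tags = [t.strip().strip(" '\"")
                                for t in tags_token.split(',')]
                else:
                    # Space-separated tags, e.g. tags: one two
                    new_tags = tags_token.split()
                # Skip empty entries (e.g. from "tags: []")
                tags.update(t for t in new_tags if t)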

old_tags = glob.glob(tag_dir + '*.md')
for tag in old_tags:
    os.remove(tag)

if not os.path.exists(tag_dir):
    os.makedirs(tag_dir)

for tag in tags:
    tag_filename = tag_dir + tag + '.md'
    # Old pages were removed above, so plain write mode is enough
    with open(tag_filename, 'w', encoding='utf-8') as f:
        write_str = '---\nlayout: tagpage\ntitle: "Tag: ' + tag + '"\ntag: ' + tag + '\nrobots: noindex\n---\n'
        f.write(write_str)

print("Tags generated ({count}): {tags}".format(count=len(tags),
                                                tags=', '.join(tags)))
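
An earlier comment in this thread reported that manually created tag pages were deleted. One possible variant, sketched below, skips the cleanup loop and only creates pages that do not exist yet. The function name `write_missing_tag_pages` is hypothetical; the front matter matches the script above.

```python
import os
import tempfile


def write_missing_tag_pages(tags, tag_dir):
    """Create a page for each tag unless one already exists; return new paths."""
    os.makedirs(tag_dir, exist_ok=True)
    created = []
    for tag in sorted(tags):
        path = os.path.join(tag_dir, tag + '.md')
        if os.path.exists(path):
            continue  # leave existing (possibly hand-written) pages alone
        with open(path, 'w', encoding='utf-8') as f:
            f.write('---\nlayout: tagpage\ntitle: "Tag: ' + tag + '"\n'
                    'tag: ' + tag + '\nrobots: noindex\n---\n')
        created.append(path)
    return created


with tempfile.TemporaryDirectory() as root:
    tag_dir = os.path.join(root, 'tag')
    print(len(write_missing_tag_pages({'blogs', 'wordpress'}, tag_dir)))
    print(len(write_missing_tag_pages({'blogs', 'personal_stories'}, tag_dir)))
```

The trade-off is that pages for tags no longer used by any post are never cleaned up automatically.
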
