TypeCodes

Updating the Shell script that builds, publishes, and syncs the blog to GitHub: fixing spaces in file names

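The previous article used a Linux shell script to build and publish the blog and sync it to GitHub: the script first pulls the Markdown articles from a personal GitHub repository to the local machine, then compiles them into static HTML files with Pelican, and finally publishes them to the Nginx web directory while also updating the GitHub personal page (vfhky.github.io).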

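Because I had not considered that Markdown file names might contain spaces, the loop over file names in the shell script (line 108 of the code) had a bug: by default the shell treats a space as a separator between values, so a single file name containing spaces was split into several file names.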
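The splitting can be reproduced in isolation. The following is a minimal sketch, not part of the original script: the temporary directory and the two file names are hypothetical, and the loop only counts how many words the shell produces from the output of find.

```shell
#!/bin/bash
# Hypothetical demo files; the real script scans its md_article directory.
orig_dir=${PWD}
demo_dir=$(mktemp -d)
cd "${demo_dir}" || exit 1
touch "my first post.md" "second.md"

# Unquoted command substitution: the shell splits the output of find on the
# characters in IFS (space, tab and newline by default).
split_count=0
for f in $(find . -type f -name "*.md"); do
    echo "word: ${f}"
    split_count=$((split_count + 1))
done

echo "files on disk: 2, loop iterations: ${split_count}"
cd "${orig_dir}" && rm -rf "${demo_dir}"
```

With these two files the loop runs four times, because "./my first post.md" arrives as three separate words.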

Figure: echo $IFS on the CentOS 7.2 server prints what looks like an empty value

1 Solutions

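A search turned up two main approaches. Method 1 changes IFS (Internal Field Separator, the shell's preset separator, used to split a command line into words). As the figure above shows, running echo $IFS on my CentOS 7.2 server prints what looks like an empty value. The default IFS is not actually empty: it consists of the whitespace characters space, tab, and newline, which are invisible in the output and are exactly why a space inside a file name splits it.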
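Method 1 can be sketched as follows. This is an illustration under assumptions rather than the script's exact code: the temporary directory and file names are hypothetical, and IFS is set to a literal newline instead of the echo -en trick that appears (commented out) in the script. The od call at the start makes the whitespace characters of the default IFS visible.

```shell
#!/bin/bash
# Hypothetical demo files; the real script scans its md_article directory.
orig_dir=${PWD}
demo_dir=$(mktemp -d)
cd "${demo_dir}" || exit 1
touch "my first post.md" "second.md"

# Dump the default IFS character by character (space, tab, newline).
printf '%s' "${IFS}" | od -c

# Method 1: split only on newlines while looping over the output of find.
old_IFS=${IFS}
newline='
'
IFS=${newline}
method1_count=0
for f in $(find . -type f -name "*.md"); do
    echo "found: ${f}"
    method1_count=$((method1_count + 1))
done
IFS=${old_IFS}   # restore the default so later commands split normally

cd "${orig_dir}" && rm -rf "${demo_dir}"
```

Each file name now arrives as one word. File names that contain newlines or glob characters such as * would still be mangled by this approach, so it is best treated as a fix for spaces only.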

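Method 2 pipes the output of find directly into while read, so each line of output is read as one file name and spaces no longer act as separators between file names.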
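A minimal sketch of Method 2 follows, again with hypothetical demo files. The first loop is the line-based form described above, with IFS= and -r added so that read keeps leading blanks and backslashes intact. The second loop is a variant that is not from the article: it uses find -print0 with read -d '' (a Bash feature) so that even a newline inside a file name is handled, and process substitution so that the counter survives the loop.

```shell
#!/bin/bash
# Hypothetical demo files; the real script scans its md_article directory.
orig_dir=${PWD}
demo_dir=$(mktemp -d)
cd "${demo_dir}" || exit 1
touch "my first post.md" "second.md"

# Method 2 as described: one line of find output per read.
find . -type f -name "*.md" | while IFS= read -r article; do
    echo "line-based: ${article}"
done

# Variant (an addition, not from the article): NUL-delimited names.
# Process substitution keeps the loop in the current shell, so the
# counter is still set after the loop ends.
method2_count=0
while IFS= read -r -d '' article; do
    echo "nul-based: ${article}"
    method2_count=$((method2_count + 1))
done < <(find . -type f -name "*.md" -print0)

cd "${orig_dir}" && rm -rf "${demo_dir}"
```

Both loops see exactly two names here. The line-based form is enough for file names with spaces; the NUL-delimited form is only needed if names may contain newlines.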

2 Complete code

The code revised along these two approaches is shown below. It has also been pushed to the GitHub project that hosts the script: https://github.com/vfhky/shell-tools

#!/bin/bash
# FileName:      github_pelican_nginx.sh
# Description:   Synchronize markdown articles with github, convert to html files using Pelican, deliver it to nginx environment.
# Simple Usage:  ./github_pelican_nginx.sh "commit_comments"
# Crontab Usage: 00 01 * * * /mydata/backups/bak_list/github_pelican_nginx.sh >/dev/null 2>&1
# (c) 2016 vfhky https://typecodes.com/linux/syngithubmarkdownpelican.html
# https://github.com/vfhky/shell-tools/blob/master/synchronize/github_pelican_nginx.sh


# Basic command.
FINDCMD="find"
MVCMD="\mv -f"
CPCMD="\cp -rf"
RMCMD="\rm -rf"
TARXCMD="tar -zxf"
TARZIPCMD="tar --warning=no-file-changed -zcf"

# Pelican compile markdown files to html.
PELICAN_COMPILE_DIR=/mydata/GitBang/pelican
# Private GitHub repository that stores your markdown files.
GITHUB_PELICAN_DIR=/mydata/GitBang/GitHub/BlogBak
# Backup dir for your website's version management.
PELICAN_TAR_DIR=/usr/share/nginx/html/pelican_content_bak
# Dir of your website in nginx server.
PELICAN_BLOG_DIR=/usr/share/nginx/html/pelican
# Dir for this shell script to generate logs automatically.
BLOG_PUBLISH_LOG_DIR=/mydata/backups/logs/blogpublish
# Your personal homepage in github.
GITHUB_PERSONAL_PAGE=/mydata/GitBang/GitHub/vfhky.github.io
# Only articles modified within the last 15 minutes are processed.
TIME_GAP=15

# Get the newest file name.
#Newest_File="ls -lrt| tail -n 1 | awk '{print $9}'"

# Name of this shell script.
PRGNAME="github_pelican_nginx"

# Current date format: e.g 20150505_2015.
Current_Date=$(date +%Y%m%d_%H%M)

# Check if current user is root.
[ "$(id -u)" != "0" ] && echo "Error: You must be root to run this script." && exit 1

# Check parameter.
if [ $# -gt 1 ]; then
    echo "Usage:    ./github_pelican_nginx.sh \"commit_comments\"" && exit 1
fi

# Logging and command-runner functions.
function ERROR() {
    echo "[$(date +%H:%M:%S:%N)][error] $*" >> "${BLOG_PUBLISH_LOG_DIR}/${Current_Date}.log"
    exit 1
}

function NOTICE() {
    echo "[$(date +%H:%M:%S:%N)][notice] $*" >> "${BLOG_PUBLISH_LOG_DIR}/${Current_Date}.log"
}

function RUNCMD() {
    echo "[$(date +%H:%M:%S:%N)][notice] $*" >> "${BLOG_PUBLISH_LOG_DIR}/${Current_Date}.log"
    # Quote "$@" so the command string is not word-split or glob-expanded before eval runs it.
    eval "$@"
}

# Git pull command function.
function Git_Pull(){
    RUNCMD "git pull origin master >/dev/null"
}

# Git commit command function.
function Git_Commit(){
    if [ $# -ne 1 ]; then
        ERROR "Usage: Git_Commit commit_comments!"
        exit 1;
    else
        RUNCMD "git pull && git add --all && git commit -m \"$1\" && git push origin master"
    fi
}

# Get the path of markdown articles in TIME_GAP minutes.
# function Get_Files_Path(){
#    RUNCMD "${FINDCMD} . -mmin -${TIME_GAP} -type f -name \"*.md\" -print0"
# }

# Lock down permissions. Be careful with file permissions on your website; a umask of 022 is a safe choice.
# umask 022

# Create the log dir.
if [ ! -d $BLOG_PUBLISH_LOG_DIR ]; then
    mkdir -p $BLOG_PUBLISH_LOG_DIR
fi


# Main process begin.
NOTICE "[1]Start pull from GitHub."
RC=0
RUNCMD "cd ${GITHUB_PELICAN_DIR}/md_article && Git_Pull"

RC=$?
if [ $RC -gt 0 ]; then
    ERROR "Git pull failed!"
fi


#### Method 1: Change the IFS to deal with blanks in the filename.
#old_IFS=$IFS
#IFS=$(echo -en "\n\b")

NOTICE "[2]Start copy the pulled articles to the compile dir of PELICAN."
# New_Article_Files=$(Get_Files_Path ${GITHUB_PELICAN_DIR}/md_article)
# Do not remove the double quotation marks: the file path may contain blanks.
#### Method 1: for New_Article_File in `${FINDCMD} . -mmin -${TIME_GAP} -type f -name "*.md"`
#### Method 2: use a while loop. IFS= and -r keep leading blanks and backslashes in the name.
${FINDCMD} . -mmin -${TIME_GAP} -type f -name "*.md"|while IFS= read -r New_Article_File
do
    if [ -z "${New_Article_File}" ]; then
        echo "No articles, nothing to do."
        ERROR "No articles, nothing to do."
    fi
    # Strip the leading "./" and quote the whole path so directories with blanks also work.
    FILE_PATH=$(dirname "${PELICAN_COMPILE_DIR}/content/articles/${New_Article_File:2}")
    RUNCMD "mkdir -p \"${FILE_PATH}\" && ${CPCMD} \"${New_Article_File}\" \"${FILE_PATH}\""
done

#### Method 1: Restore the IFS setting.
#IFS=${old_IFS}

RC=$?
if [ $RC -gt 0 ]; then
    ERROR "Copy the pulled articles failed!"
fi


NOTICE "[3]Start compile in pelican."
RUNCMD "cd ${PELICAN_COMPILE_DIR} && make publish > /dev/null"

RC=$?
if [ $RC -gt 0 ]; then
    ERROR "Compile in pelican failed!"
fi


NOTICE "[4]Start generate a tar package and move it to the backup dir."
# tar exits with status 1 when it reports "file changed as we read it", so the move is chained with OR logic to run in that case.
RUNCMD "cd ${PELICAN_COMPILE_DIR}/output && ${TARZIPCMD} ${Current_Date}.tar.gz . || ${MVCMD} ${Current_Date}.tar.gz ${PELICAN_BLOG_DIR}"

RC=$?
if [ $RC -gt 0 ]; then
    ERROR "Generate a tar package failed!"
fi


NOTICE "[5]Start unpack the target files."
RUNCMD "cd ${PELICAN_BLOG_DIR} && ${TARXCMD} ${Current_Date}.tar.gz && ${MVCMD} ${Current_Date}.tar.gz ${PELICAN_TAR_DIR}"

RC=$?
if [ $RC -gt 0 ]; then
    ERROR "Unpack the target files failed!"
fi

# if [ $# -eq 1 ]; then
if [ -n "$1" ]; then
    echo "Ready to synchronize to the homepage on github.com."
    NOTICE "[6]Start copy the package to the local homepage repository cloned from GitHub."
    RUNCMD "${CPCMD} ${PELICAN_TAR_DIR}/${Current_Date}.tar.gz ${GITHUB_PERSONAL_PAGE} && cd ${GITHUB_PERSONAL_PAGE} && ${TARXCMD} ${Current_Date}.tar.gz && ${RMCMD} ${Current_Date}.tar.gz"

    RC=$?
    if [ $RC -gt 0 ]; then
        ERROR "Copy the package to the local homepage repository failed!"
    fi

    NOTICE "[7]Start synchronize website to my homepage on GitHub."
    # read -p "Please input your comments on this commitment: " COMMIT_COMMENTS
    # while [[ -z "${COMMIT_COMMENTS}" ]]
    # do
    #   read -p "Comments can not be empty.Please input again: " COMMIT_COMMENTS
    # done
    # RUNCMD "Git_Commit \"${COMMIT_COMMENTS}\""
    RUNCMD "Git_Commit \"$1\""

    RC=$?
    if [ $RC -gt 0 ]; then
        ERROR "Synchronize website to GitHub failed!"
    fi
else
    echo "Not synchronizing your website to the homepage on github.com."
    NOTICE "[6]Not synchronizing your website to the homepage on github.com."
fi


NOTICE "------END------"
exit 0
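One detail of the piped while loop in the script is worth checking in isolation. In Bash with default settings, each part of a pipeline runs in a subshell, so an exit inside the loop (for example through ERROR) ends only that subshell, and the main script sees it as the pipeline's exit status. The loop body also never runs when there is no input, so a check for an empty name inside the loop is not reached in that case. The sketch below uses made-up input instead of find and assumes the lastpipe option is not enabled.

```shell
#!/bin/bash
# An exit inside a piped loop leaves only the subshell running the loop.
printf 'a\nb\n' | while IFS= read -r line; do
    exit 3
done
pipeline_status=$?
echo "main script still running, pipeline status: ${pipeline_status}"

# With no input at all, the loop body is never executed.
body_ran=no
while IFS= read -r line; do
    body_ran=yes
done < /dev/null
echo "loop body ran on empty input: ${body_ran}"
```

This is why the script's check of $? after the loop matters: it is the only place where a failure inside the piped loop can stop the main script.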

3 Script execution result

Figure: execution result of the github_pelican_nginx script
