author B Stack <bgstack15@gmail.com> 2020-06-09 16:55:15 -0400
committer B Stack <bgstack15@gmail.com> 2020-06-09 16:55:15 -0400
commit 0c80c29d0fde63d9617d5769038963375e698628
tree da4141376f5f0437ab5e659ebb4f8bdf69a9e0de /flow.md
initial commit
Diffstat (limited to 'flow.md')
-rw-r--r-- flow.md 57
1 files changed, 57 insertions, 0 deletions
diff --git a/flow.md b/flow.md
new file mode 100644
index 0000000..5c81d5e
--- /dev/null
+++ b/flow.md
@@ -0,0 +1,57 @@
+#### Metadata
+Startdate: 2020-05-30 15:51
+References:
+Everything on this page, for jq filtering. https://stedolan.github.io/jq/manual/#Basicfilters
+
+
+# Flow
+
+1. Use gitlablib to list all issue web URLs, and then remove all the "build", "buildmodify" and similar CI/CD issues.
+
+ . gitlablib.sh
+ list_all_issues | tee output/issues.all
+ <output/issues.all jq '.[]| if(.title|test("build-?(a(ll)?|mod(ify)?|add|del)?$")) then empty else . end | .web_url' | sed -r -e 's/"//g;' > output/issues.all.web_url
+
+ Manually munge the data to put the devuan/devuan-project/issues/20 on top.
+
+2. Use fetch-issue-webpages.py to fetch all those webpages.
+
+ ln -s issues.all.web_url output/files-to-fetch.txt
+ ./fetch-issue-webpages.py
+
+3. Munge the downloaded HTML.
+ All of the following is performed by `flow-part2.sh`.
+
+ * fix newlines
+
+ sed -i -r -e 's/\\n/\n/g;' /mnt/public/www/issues/*.html
+
+ * Find every `<time>` tag that has a `data-original-title` attribute and replace the tag's contents with that attribute's value. This step also pretty-prints the HTML with BeautifulSoup so that some of the following commands work correctly.
+
+ ls -1 /mnt/public/www/issues/*.html > output/files-for-timestamps.txt
+ ./fix-timestamps.py
+
+ * download all relevant images, and then fix the image references in the html.
+
+ ./fetch-images.sh
+ sed -i -f fix-images-in-html.sed /mnt/public/www/issues/*.html
+
+ * download all stylesheets, and then fix the stylesheet references in the html.
+
+ mkdir -p /mnt/public/www/issues/css
+ ./fetch-css.sh
+ sed -i -f fix-css-in-html.sed /mnt/public/www/issues/*.html
+
+ * fix some encoding oddities
+
+ sed -i -f remove-useless.sed /mnt/public/www/issues/*.html
+
+ * remove html components that are not necessary
+
+ ./remove-useless.py
+
+ * Fix links that point to the defunct domain without-systemd.org.
+
+ sed -i -r -f fix-without-systemd-links.sed /mnt/public/www/issues/*.html
+
+ * build some sort of index?
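
A note on the title filter in step 1: the regex is anchored only at the end of the title, so it drops any issue whose title merely ends in "build" (optionally followed by one of the listed suffixes), not just the CI/CD issues. The sketch below checks that behaviour without needing jq. It assumes `grep -E` treats this particular pattern the same way jq's `test` does, and the sample titles are hypothetical; only the pattern itself comes from `flow.md`.

```shell
#!/bin/sh
# Pattern copied from the jq filter in step 1 of flow.md.
pattern='build-?(a(ll)?|mod(ify)?|add|del)?$'

# Hypothetical helper: succeeds when the jq filter would drop this title.
is_ci_issue() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Hypothetical sample titles, for illustration only.
for title in "build" "buildmodify" "build-add" \
             "Package foo fails to build" "Rebuild documentation"; do
    if is_ci_issue "$title"; then
        echo "drop: $title"
    else
        echo "keep: $title"
    fi
done
```

If titles such as "Package foo fails to build" should be kept, one option is to anchor the pattern at the start as well (`^build-?...$`), assuming the CI/CD issue titles consist of nothing but the build keyword.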