index : newspipe
A web news aggregator.
bgstack15
path: root/pyaggr3g470r/crawler.py
Commit message | Author | Date
Full text seaerch with Whoosh has been removed. | Cédric Bonhomme | 2015-04-22
Removed debug print. | Cédric Bonhomme | 2015-04-08
The minimum error count is now specified in the configuration file. | Cédric Bonhomme | 2015-04-08
Better handling of the error logging in the crawler. | Cédric Bonhomme | 2015-03-08
Disable the feed when more than 2 erros (test). | Cédric Bonhomme | 2015-03-05
Minor update to the 'feed' template. | Cédric Bonhomme | 2015-03-05
Take advantage of some new fields of the Feed objects. | Cédric Bonhomme | 2015-03-05
Test with the old crawler (temporary during the transition). | Cédric Bonhomme | 2015-03-04
unused import. | Cédric Bonhomme | 2015-02-22
Indexation is now restored. | Cédric Bonhomme | 2015-02-22
bug fix... | Cédric Bonhomme | 2015-02-22
Prevents BeautifulSoup4 from adding extra <html><body> tags to the soup with ... | Cédric Bonhomme | 2015-02-22
This test will be used for some weeks in order to avoid duplicates with the n... | Cédric Bonhomme | 2015-02-19
It is now unseless to test the value of article.date at this point. | Cédric Bonhomme | 2015-02-19
Alembic is magic! | Cédric Bonhomme | 2015-02-18
Minor changes in the crawler (test of asyncio.async). | Cédric Bonhomme | 2015-02-12
Time to sleep. | Cédric Bonhomme | 2015-02-11
Some minor improvements concerning the parsing of the article publication date. | Cédric Bonhomme | 2015-02-11
In the case it is not possible to resolve the URL of an article we just ignor... | Cédric Bonhomme | 2015-02-11
Oh my god. | Cédric Bonhomme | 2015-02-11
Fixed an other bug in the new crawler... | Cédric Bonhomme | 2015-02-11
bug when the list of feeds to fetch is empty | Cédric Bonhomme | 2015-02-09
Misc improvements for the crawler. A semaphore is used to limit the number of... | Cédric Bonhomme | 2015-02-08
Fetch all feeds of the list (not only the first 20 feeds). | Cédric Bonhomme | 2015-02-04
Get the feeds with aiohttp. | Cédric Bonhomme | 2015-02-04
Test if we effectively have retrieved some articles. | Cédric Bonhomme | 2015-01-21
clean_url is now working with Python3 | Cédric Bonhomme | 2015-01-21
Misc fixes to the crawler. | Cédric Bonhomme | 2015-01-21
Added link to examples. | Cédric Bonhomme | 2015-01-21
First implementation with asyncio (not really async for the moment). | Cédric Bonhomme | 2015-01-21
Updated years f copyright. | Cédric Bonhomme | 2015-01-03
Updated comment. | Cédric Bonhomme | 2015-01-02
Hack: Re-add sslwrap to Python 2.7.9. | Cédric Bonhomme | 2015-01-02
Import urlib.request for Python 3. | Cédric Bonhomme | 2015-01-02
Notifications functions and functions to send emails are now in separated files. | Cédric Bonhomme | 2014-11-20
Updated some comments. | Cédric Bonhomme | 2014-11-19
When the title is not found. | Cédric Bonhomme | 2014-11-09
Log 'bozo' exception. | Cédric Bonhomme | 2014-11-08
Configuration variables has been updated. | Cédric Bonhomme | 2014-08-18
minor changes. | Cédric Bonhomme | 2014-07-13
Timeout of 5 seconds for all sockets. | Cédric Bonhomme | 2014-07-13
Update headers tags. | Cédric Bonhomme | 2014-07-13
Performance improvement for the crawler (database insertion step). | Cédric Bonhomme | 2014-07-13
Minor improvemnts for the crawler. | Cédric Bonhomme | 2014-07-13
if the crawler is not able to get the link of the article, continue. | Cédric Bonhomme | 2014-06-21
fixes #7 | Cédric Bonhomme | 2014-06-10
making pyagregator runnable by apache | François Schmidts | 2014-06-09
supporting feed without date or with ill formated date | François Schmidts | 2014-06-08
Removed unused variable. | Cédric Bonhomme | 2014-05-03
keep the original title. | Cédric Bonhomme | 2014-05-03