path: root/pyaggr3g470r/crawler.py
Commit message (author, date)
* bug when the list of feeds to fetch is empty (Cédric Bonhomme, 2015-02-09)
* Misc improvements for the crawler. A semaphore is used to limit the number of simultaneous connections. (Cédric Bonhomme, 2015-02-08)
* Fetch all feeds of the list (not only the first 20 feeds). (Cédric Bonhomme, 2015-02-04)
* Get the feeds with aiohttp. (Cédric Bonhomme, 2015-02-04)
* Test if we effectively have retrieved some articles. (Cédric Bonhomme, 2015-01-21)
* clean_url is now working with Python3 (Cédric Bonhomme, 2015-01-21)
* Misc fixes to the crawler. (Cédric Bonhomme, 2015-01-21)
* Added link to examples. (Cédric Bonhomme, 2015-01-21)
* First implementation with asyncio (not really async for the moment). (Cédric Bonhomme, 2015-01-21)
* Updated years of copyright. (Cédric Bonhomme, 2015-01-03)
* Updated comment. (Cédric Bonhomme, 2015-01-02)
* Hack: Re-add sslwrap to Python 2.7.9. (Cédric Bonhomme, 2015-01-02)
* Import urllib.request for Python 3. (Cédric Bonhomme, 2015-01-02)
* Notification functions and functions to send emails are now in separate files. (Cédric Bonhomme, 2014-11-20)
* Updated some comments. (Cédric Bonhomme, 2014-11-19)
* When the title is not found. (Cédric Bonhomme, 2014-11-09)
* Log 'bozo' exception. (Cédric Bonhomme, 2014-11-08)
* Configuration variables have been updated. (Cédric Bonhomme, 2014-08-18)
* minor changes. (Cédric Bonhomme, 2014-07-13)
* Timeout of 5 seconds for all sockets. (Cédric Bonhomme, 2014-07-13)
* Update headers tags. (Cédric Bonhomme, 2014-07-13)
* Performance improvement for the crawler (database insertion step). (Cédric Bonhomme, 2014-07-13)
* Minor improvements for the crawler. (Cédric Bonhomme, 2014-07-13)
* if the crawler is not able to get the link of the article, continue. (Cédric Bonhomme, 2014-06-21)
* fixes #7 (Cédric Bonhomme, 2014-06-10)
* making pyagregator runnable by apache (François Schmidts, 2014-06-09)
    * adding bootstrap module for basic import
    * redoing logging (config, proper use of the logging module)
    * making secret part of config (random wouldn't work with apache since it uses different instances of python)
    * making server entry point not executing application if just imported
    * not writing file for opml when we can read it from memory
* supporting feed without date or with ill-formatted date (François Schmidts, 2014-06-08)
* Removed unused variable. (Cédric Bonhomme, 2014-05-03)
* keep the original title. (Cédric Bonhomme, 2014-05-03)
* Using lxml parser instead of html.parser, fixes #4. (Cédric Bonhomme, 2014-05-03)
* Better to send email without Flask-Mail. (Cédric Bonhomme, 2014-04-27)
* Improved code readability. (Cédric Bonhomme, 2014-04-27)
* Cleaned code. (Cédric Bonhomme, 2014-04-27)
* Separate indexes by users. (Cédric Bonhomme, 2014-04-23)
* Autoindexation of new articles (not on Heroku). (Cédric Bonhomme, 2014-04-23)
* Updated comments and log messages. (Cédric Bonhomme, 2014-04-13)
* Removed old feedgetter module. (Cédric Bonhomme, 2014-04-13)
* Test of the new crawler with gevent. (Cédric Bonhomme, 2014-04-13)