Commit graph

383 commits

Author SHA1 Message Date
jesopo
c79bd6d0ba utils.http.Response.decode() should default to detected encoding 2019-11-28 07:35:16 +00:00
jesopo
e4a5bd01e9 explicitly use "lxml" for finding page encoding 2019-11-26 14:34:48 +00:00
jesopo
8e9da0d681 _find_encoding takes bytes and soupifies now 2019-11-26 13:58:37 +00:00
jesopo
c898bc4be1 utils.http.request_many() shouldn't decode data for Response 2019-11-26 13:54:17 +00:00
jesopo
2d21dfa229 utils.http.Response.data should always be bytes - add .decode and .soup 2019-11-26 13:42:01 +00:00
jesopo
ed775ddbe3 remove parser from utils.http.Request, add Request.soup() 2019-11-26 11:35:56 +00:00
jesopo
93aea08818 utils.datetime.datetime_utcnow() -> utils.datetime.utcnow() 2019-11-25 18:18:09 +00:00
jesopo
6a6e789ec9 add cookies and .json() to utils.http.Response objects 2019-11-25 18:17:30 +00:00
jesopo
ab8bc65cc9 change utils.http.Request to be a dataclass 2019-11-25 13:42:10 +00:00
jesopo
4d30263315 give bitbot a unique User-Agent
closes #206
2019-11-20 14:42:34 +00:00
jesopo
bd176240d6 consecutive HASH_STOP chars at start shouldn't count as a stop 2019-11-19 14:45:47 +00:00
jesopo
554f21a84c HASH_STOP should still be respected if last character 2019-11-19 14:43:24 +00:00
jesopo
65023dc84b move "+5m" syntax parsing out to utils.parse 2019-11-18 15:57:23 +00:00
jesopo
fe25c6bc26 switch some utils.irc functions to use f-strings 2019-11-18 14:16:30 +00:00
jesopo
d89a3125ab truncate nickname hash operations to 64bit 2019-11-18 14:09:47 +00:00
jesopo
a8b1bd95f7 implement weechat-style djb2 colour hashing 2019-11-18 13:51:55 +00:00
jesopo
a7e21abfad import missing comma, 'Events'->'Event' 2019-11-18 12:10:20 +00:00
jesopo
06161b326c remove unused imports in src/utils/__init__.py 2019-11-18 12:09:18 +00:00
jesopo
9c4902dcfe "EventsResultsError"->"EventResultsError", move errors to utils.error 2019-11-18 12:06:59 +00:00
jesopo
5d55086847 move utils.consts.BITBOT_MAGIC to utils.decorators 2019-11-15 14:09:35 +00:00
jesopo
5d01db8514 move all datetime-related code from utils/__init__ to utils.datetime 2019-11-15 13:59:09 +00:00
jesopo
5e8cf06a45 dont expose has_magic/get_magic in utils/__init__.py, ref directly 2019-11-15 13:41:03 +00:00
jesopo
bfcf40edd7 split some stuff out of utils/__init__.py 2019-11-15 13:39:24 +00:00
jesopo
2cb55306c3 show first-words datestamp on !words output 2019-11-15 12:13:16 +00:00
jesopo
a1e9aabb84 add typehinting for foreground/background 2019-11-13 10:47:58 +00:00
jesopo
5d2dd9178f only set color_finished=True when is_background, otherwise next char sets it 2019-11-13 10:43:15 +00:00
jesopo
b7bfd414be hash_colorize code should first be looked up in HASH_COLORS 2019-11-11 12:48:37 +00:00
jesopo
727fb3427d .lower() strings when hash-colorizing them 2019-11-11 12:40:36 +00:00
jesopo
ac30f8d4cc don't use hash() for hashed colorising as it's not stable through restarts 2019-11-11 12:13:46 +00:00
jesopo
2ad8623eb3 add utils.irc.hash_colorize() to color a string by the string's hash 2019-11-11 12:06:16 +00:00
jesopo
09fe1c5a70 don't stop parsing colors at comma 2019-11-04 13:33:10 +00:00
jesopo
2b001e1ec6 ' ' -> " " 2019-10-31 13:06:26 +00:00
Valentin Lorentz
fbf8cd1a16 Fix type errors detected by 'mypy --ignore-missing-imports src'. 2019-10-30 22:26:59 +01:00
jesopo
3634b72622 add utils.date_human() - use it in badges.py 2019-10-30 10:25:07 +00:00
jesopo
4d85c3d1e0 utils.parse doesn't need to import utils 2019-10-29 18:03:03 +00:00
jesopo
080bcef3a0 'from src.utils import' -> 'from . import' 2019-10-29 18:02:50 +00:00
jesopo
40a340e94f utils.cli shouldn't know about Database 2019-10-29 18:00:38 +00:00
jesopo
46e4b75f6b utils.irc doesn't need to know about the whole of utils 2019-10-29 18:00:19 +00:00
jesopo
8983338680 move src/utils/irc/__init__.py to src/utils/irc.py 2019-10-28 10:57:19 +00:00
jesopo
7ee65f8f8c remove src/utils/irc/protocol.py 2019-10-28 10:56:33 +00:00
jesopo
1bddc3b37f Revert "remove unneeded import"
This reverts commit 8425c11c97.
2019-10-27 10:32:13 +00:00
jesopo
8425c11c97 remove unneeded import 2019-10-27 10:25:37 +00:00
jesopo
8f4b5a0e70 move IRCLine related code from utils.irc to IRCLine.py 2019-10-27 10:19:00 +00:00
jesopo
3a755bb15f don't consume past 2nd digit in e.g. "\03033,123" 2019-10-25 17:12:24 +01:00
jesopo
f64131a10f support utf8 hostnames by punycode (idna) encoding 2019-10-18 10:58:24 +01:00
jesopo
2c19bdb949 add a fairly basic file locking mechanism with src/LockFile.py
closes #96
2019-10-10 12:11:03 +01:00
jesopo
0331b763ff refactor multi-line-to-line normalisation to utils.parse.line_normalise(), use it in rss.py
closes #174
2019-10-10 10:33:18 +01:00
jesopo
68aa89f16f commit FunctionSetting changes i forgot to commit yesterday 2019-10-08 11:38:56 +01:00
jesopo
9ab817ca58 parse out content_type in Response ctor 2019-10-05 22:56:56 +01:00
jesopo
b2473a4ac4 parse content-type out in utils.http.request, put it on Response object 2019-10-04 13:07:09 +01:00
jesopo
3466a3c43e Allow utils.Setting_ parse functions to throw detailed errors 2019-10-04 10:25:48 +01:00
jesopo
f306213cb8 'is_localhost()' -> 'host_permitted()' 2019-09-30 15:15:20 +01:00
jesopo
b9c64b7cf1 use ipaddress is_loopback etc to do better forbidden ranges
closes #87
2019-09-30 15:12:01 +01:00
jesopo
7db17c0962 add utils.parse.try_int() because .isdigit() isnt good enough 2019-09-26 13:44:38 +01:00
jesopo
2f49fb99e9 assume http fallback_encoding by content-type (utf8 for json) 2019-09-25 15:32:09 +01:00
jesopo
72649a90c2 only BeautifulSoup for finding encoding when it's a html-ish type 2019-09-20 13:38:00 +01:00
jesopo
efc0e197e5 Allow passing source Hostmask to IRCBatch 2019-09-19 18:16:10 +01:00
jesopo
e34259f967 log call was replaced with Exception but [] on args remained 2019-09-19 15:30:27 +01:00
jesopo
88a69aaa66 give Requests, use them in utils.http.request_many() 2019-09-19 14:54:44 +01:00
jesopo
d8e3a1c7ee utils.http.request_() has no self, let alone self.log 2019-09-19 14:02:48 +01:00
jesopo
b69c9146b2 should be using pair_start/pair_end throughout for 2019-09-19 13:51:27 +01:00
jesopo
cd0d39ee5e also show "bad" data in HTTPParsingException when a message is provided 2019-09-18 14:20:59 +01:00
jesopo
312f8906ae show "bad" data in HTTPParsingException message 2019-09-18 10:52:05 +01:00
jesopo
a003c97fba move q.close() to where it will be called even if deadline is hit 2019-09-18 10:24:01 +01:00
jesopo
dce6eee8c9 move _raise_deadline() out of except block to clean up printed stacktrace 2019-09-18 10:21:40 +01:00
jesopo
ee6360be22 don't check already-read data when checking for too-large requests
this check was here because the first read will return empty if it was an
invalid byte sequence for e.g. gzip because we needed to receive more data. the
second read will always return data (not decoded) so regardless of what the
already-read data is, the second read is the only criteria we need.
2019-09-17 17:33:23 +01:00
jesopo
1ac7f2697e log which URL caused an error in request_many 2019-09-17 17:09:19 +01:00
jesopo
98545a9fb4 only decode content-types in DECODE_CONTENT_TYPES 2019-09-17 16:12:03 +01:00
jesopo
8ca0d30fef Response.__init__() needs encoding now 2019-09-17 14:11:12 +01:00
jesopo
b7dd78ef1a restore 5 second (instead of default 10) deadline for http.request 2019-09-17 13:44:14 +01:00
jesopo
94c3ff962b use utils.deadline_process() in utils.http._request() so background threads can call _request()
2019-09-17 13:41:11 +01:00
jesopo
fa95eaa9eb add .get() to CaseInsensitiveDict 2019-09-17 13:40:37 +01:00
jesopo
d454f9b732 use Queue.get() with timeout, not Process.join() for timeout
this was because the threads spawned by multiprocessing.Queue seemed to be
making Process.join() believe the subprocess had not exited.
2019-09-17 13:39:23 +01:00
jesopo
1ed14f9a17 first draft of multiprocess.Process deadline system 2019-09-17 11:56:30 +01:00
jesopo
334d580c57 'seperate_hostmask()' -> 'parse_hostmask()' 2019-09-16 18:43:57 +01:00
jesopo
47735421b8 add json_body arg to Request to json-encode body, only return from body if not null
2019-09-16 10:57:18 +01:00
jesopo
f9d13dc373 support '0' as an IntSetting value 2019-09-15 22:22:30 +01:00
jesopo
d950eb3660 add utils.SensitiveSetting, to .format() hide value 2019-09-12 12:17:31 +01:00
jesopo
ba0911f2e7 add utils.Setting.format() so subtypes can format differently 2019-09-12 12:17:09 +01:00
jesopo
9d33354d16 translate INVITE from [channel_name, target] to [target, channel_name] 2019-09-12 11:24:25 +01:00
jesopo
540c7b8c44 Revert "INVITE should be [channel_name, target]"
This reverts commit f3d8ffad2c.
2019-09-12 11:23:29 +01:00
jesopo
f3d8ffad2c INVITE should be [channel_name, target] 2019-09-12 11:21:29 +01:00
jesopo
77f50187c5 allow Requests to specify a useragent 2019-09-12 10:41:50 +01:00
jesopo
9d6a3982ed add a helper utils.http.Client static object 2019-09-11 17:53:49 +01:00
jesopo
51dc26d113 add proxy to Request objects 2019-09-11 17:53:37 +01:00
jesopo
4a97c9eb0d refactor utils.http.requests to support a Request object 2019-09-11 17:44:07 +01:00
jesopo
8f8cf92ae2 automatically decode certain http content types 2019-09-11 15:28:13 +01:00
jesopo
a9b106c6be Don't try to .decode non-html things, default iso-lat-1 for non-html too 2019-09-09 16:17:26 +01:00
jesopo
b83f5d9e30 add flag to disable encoding detection 2019-09-09 14:59:08 +01:00
jesopo
5ef2b7af27 'str.split' -> 's.split' 2019-09-09 14:53:11 +01:00
jesopo
1df82c1cb2 still default to iso-latin-1 if no on-page or in-header content-type is present 2019-09-09 14:48:26 +01:00
jesopo
0a67659637 only look for <meta>-related tags when there are meta tags 2019-09-09 14:39:19 +01:00
jesopo
0a1077c5cd add explicit None return for _find_encoding (mypy) 2019-09-09 14:25:01 +01:00
jesopo
ff9c82bf67 change utils.http.request to best-effort detect on-page encoding
closes #113
2019-09-09 14:11:18 +01:00
jesopo
007bb78d30 make utils.from_pretty_time() format much stricter 2019-09-04 11:22:56 +01:00
jesopo
397cfa8e7e correctly qualify DeadlineExceededException namespace 2019-09-03 14:54:59 +01:00
jesopo
b7b2f31c1c use utils.deadline() in utils.http.request, not raw sigalrm 2019-09-02 15:50:21 +01:00
jesopo
d42d694e64 move deadline alarm time check inside try/finally 2019-09-02 15:50:12 +01:00
jesopo
9cc1ee98eb Pass the content of a webpage to HTTPParsingException 2019-09-02 13:27:44 +01:00
jesopo
408b89aeb7 use \S+ for url regex (for non-ascii chars), use url_sanitize to catch <> 2019-09-02 13:25:48 +01:00