# URL 'greylist': save matching urls to one side, to eyeball later and decide if
# they should be included after all or whether it was okay to have skipped them

# FORMAT:
# precede URL by ^ to greylist urls that match the given prefix
# succeed URL by $ to greylist urls that match the given suffix
# ^url$ will greylist urls that match the given url completely
# Without either ^ or $ symbol, urls containing the given url will get greylisted

# Product sites: unwanted auto-translation pages of online product stores and other websites
/product/
/products/
/product-page/
/product-category/
ledlamp.china-led-lighting.com
ledpar64.china-led-lighting.com
ledwallwasher.china-led-lighting.com
abacre.com
cn-huafu.net
apteka.social

# not product stores but autotranslated?
192-168-1-1l.com
19216811login.club
19216811login.club
1videosmusica.com
256file.com

# already in greylisting of all .ru
#7773033.ru
#abali.ru
#allbeautyone.ru
aqualuz.org

# if page doesn't load and can't be tested
1videosmusica.com
www.kiterewa.pl

# MANUALLY INSPECTED URLS AND ADDED TO GREYLIST

# license plate site? - already in greylisting of all .ru
#eba.com.ru

# As per archive.org, there's just a photo on the defunct page at this site
# And the picture label and filename is probably Japanese
agri.mine.utsunomiya-u.ac.jp

# seems to be Indonesian or Malaysian Bible rather than in Maori or any Polynesian language
alkitab.life:2022

# appears defunct
alixira.com

# single seedURL was not a page in Maori, but global languages.
# And the rest of the domain appears to be in English.
#anglican.org
# but we want the seedURLs from justus.anglican.org,
# so grab anglican.org anyway

### TLDs that we greylist - any exceptions will be in the whitelist
# Our list of .ru and .pl domains were not relevant
.ru/
.pl/
.tk/