Domain-only filter lists support #481

Closed
opened 2026-03-04 00:33:23 -05:00 by deekerman · 11 comments
Owner

Originally created by @furunos on GitHub (Jan 6, 2019).

When I add `il.com` as a blocking rule, `gmail.com` gets blocked as well, which causes trouble.

**edit (by ameshkov)** see this comment for the details: https://github.com/AdguardTeam/AdGuardHome/issues/535#issuecomment-452489005

@furunos commented on GitHub (Jan 6, 2019):

I suggest adding the following code.

```
dnsfilter/rule_to_regexp.go
@@ -1,11 +1,13 @@
 package dnsfilter
 
 import (
+	"regexp"
 	"strings"
 )
 
 func ruleToRegexp(rule string) (string, error) {
 	const hostStart = `(?:^|\.)`
+	const hostStartNoSub = `^`
 	const hostEnd = `$`
 
 	// empty or short rule -- do nothing
@@ -23,6 +25,12 @@
 	if rule[0] == '|' && rule[1] == '|' {
 		sb.WriteString(hostStart)
 		rule = rule[2:]
+	} else {
+		sb.WriteString(hostStartNoSub)
+		ruleRegexp := regexp.MustCompile(`[a-zA-Z]$`)
+		if ruleRegexp.MatchString(rule) {
+			rule += `^`
+		}
 	}
 
 	for i, r := range rule {
```

@ameshkov commented on GitHub (Jan 6, 2019):

You need to add `||il.com^` if you want to block `il.com` and its subdomains.

@furunos commented on GitHub (Jan 7, 2019):

With the following filter list, AGH blocks "gmail.com" because the list contains the entry "il.com".

https://v.firebog.net/hosts/Airelle-hrsk.txt


@furunos commented on GitHub (Jan 8, 2019):

If I use a domain-only filter such as "il.com", AGH matches it against any part of the queried domain. As a result, AGH blocks domains the user never intended to block. This is inconvenient because many published filter lists cannot be used.


@hmage commented on GitHub (Jan 8, 2019):

Hi!

AdGuard filter rules follow the standard ad blocking syntax, described [here](https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters), [here](https://adblockplus.org/filters) and [here](https://github.com/gorhill/uBlock/wiki/Static-filter-syntax). We have simplified the syntax and removed features that make no sense in a DNS context, but essentially we match hostnames, rather than full URLs, against the filters.

Since the standard ad blocking syntax applies wildcard matching at the beginning and end of a rule by default, you need an anchor at the beginning (`||`) and at the end (`^`).

We also support [hosts format matching](https://en.wikipedia.org/wiki/Hosts_(file)), but you will need to fix your rules by adding an IP address at the beginning of each line.
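To make the difference between an unanchored and an anchored rule concrete, here is a minimal Go sketch. It is not the actual AdGuard Home rule compiler: `ruleToPattern` and `matches` are hypothetical helpers that only handle the `||` and `^` anchors, reusing the `(?:^|\.)` and `$` fragments that appear in the patch quoted earlier in this thread.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ruleToPattern is a hypothetical, simplified stand-in for the real rule
// compiler. It handles only the "||" start anchor and the "^" end anchor.
func ruleToPattern(rule string) string {
	var sb strings.Builder
	if strings.HasPrefix(rule, "||") {
		// Match at the start of the hostname or right after a dot.
		sb.WriteString(`(?:^|\.)`)
		rule = rule[2:]
	}
	anchoredEnd := strings.HasSuffix(rule, "^")
	rule = strings.TrimSuffix(rule, "^")
	// Escape the rest so that "." matches a literal dot only.
	sb.WriteString(regexp.QuoteMeta(rule))
	if anchoredEnd {
		sb.WriteString(`$`)
	}
	return sb.String()
}

// matches reports whether host is matched by rule under the sketch above.
func matches(rule, host string) bool {
	return regexp.MustCompile(ruleToPattern(rule)).MatchString(host)
}

func main() {
	for _, rule := range []string{"il.com", "||il.com^"} {
		for _, host := range []string{"gmail.com", "il.com", "sub.il.com"} {
			fmt.Printf("%-10s vs %-11s -> %v\n", rule, host, matches(rule, host))
		}
	}
}
```

Under these assumptions the bare rule `il.com` matches all three hostnames, including `gmail.com`, while `||il.com^` matches only `il.com` and `sub.il.com`.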

@furunos commented on GitHub (Jan 8, 2019):

I think my idea is simple, but I respect your policy. How about adding a function that rejects such a filter, so that users know domain-only filters cannot be used?


@furunos commented on GitHub (Jan 8, 2019):

`il.com` = `||il.com`

Could you treat "il.com" and "||il.com" as the same? This would reduce the number of wrong blocks caused by domain lists. The implementation is also simple.

```
dnsfilter/rule_to_regexp.go
@@ -20,8 +20,9 @@
 
 	var sb strings.Builder
 
+	sb.WriteString(hostStart)
 	if rule[0] == '|' && rule[1] == '|' {
-		sb.WriteString(hostStart)
 		rule = rule[2:]
 	}
```

@ameshkov commented on GitHub (Jan 8, 2019):

As I recall, malwaredomains provides a `justdomains` list, and there's also firebog.net. Maybe you're right and it makes sense to handle the case of simple domain name matching.

I guess we could detect that the line looks like a domain name and automatically prepend `||` to it.

Reopening this issue as a feature request; we'll discuss it a bit later.
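The detection idea could look roughly like the Go sketch below. Everything in it is an assumption for illustration: `domainRe` is one deliberately strict guess at what "looks like a domain name" means, `normalizeRule` is a hypothetical name, and appending `^` in addition to `||` follows the `||il.com^` form recommended earlier in the thread rather than anything decided here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// domainRe is an assumed, deliberately strict definition of "looks like a
// domain name": dot-separated labels of letters, digits and hyphens, ending
// in an alphabetic top-level label of at least two characters.
var domainRe = regexp.MustCompile(
	`^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$`)

// normalizeRule (hypothetical name) rewrites a bare domain into an anchored
// rule and leaves every other line untouched.
func normalizeRule(line string) string {
	line = strings.TrimSpace(line)
	if domainRe.MatchString(line) {
		return "||" + line + "^"
	}
	return line
}

func main() {
	for _, line := range []string{"il.com", "||il.com^", "127.0.0.1 il.com", "/ads/"} {
		fmt.Printf("%q -> %q\n", line, normalizeRule(line))
	}
}
```

With this sketch only the bare `il.com` line is rewritten (to `||il.com^`); lines that are already adblock-style rules or hosts-format entries pass through unchanged. A real implementation would likely need a looser pattern, for example to accept punycode top-level domains.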

@Apexgh0st commented on GitHub (Jan 14, 2019):

Some domain-only filter lists have lines starting with "#" (or sometimes "$") for notes, e.g. https://ransomwaretracker.abuse.ch/downloads/RW_DOMBL.txt. Just wondering whether those special characters would conflict with your syntax rules in the future (I'm not familiar with AdGuard syntax/adblock rules).

@ameshkov commented on GitHub (Jan 14, 2019):

No prob, these lines will be discarded.
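As a rough illustration of discarding such note lines while reading a domain-only list, here is a small Go sketch. `loadDomains` is a hypothetical helper, and the choice of `#` and `$` as the prefixes to skip is taken from the question above; which lines AdGuard Home actually discards is not spelled out in this thread.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// loadDomains (hypothetical name) returns the non-empty lines of a list,
// skipping lines that begin with '#' or '$' (assumed to be notes).
func loadDomains(list string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(list))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || line[0] == '#' || line[0] == '$' {
			continue
		}
		out = append(out, line)
	}
	return out
}

func main() {
	// Made-up sample input, not taken from any real list.
	list := "# Example domain blocklist\n$ generated note\n\nil.com\nexample.org\n"
	fmt.Println(loadDomains(list))
}
```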


@szolin commented on GitHub (Oct 2, 2020):

AGH v0.103 doesn't block `gmail.com` when an `il.com` rule is active.