tag rules
This table contains association rules for tags in the MangaDex dataset. We can use these rules to find relationships between tags and to recommend related tags. Also take a look at the network visualization.
notes
We use the FPGrowth algorithm in PySpark to generate the association rules. We also note relevant terminology sourced from Wikipedia below.
support
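Support is the fraction of transactions in the dataset that contain a given itemset of tags.
$$\mathrm{supp}(X) = \frac{|\{(i, t) \in T : X \subseteq t\}|}{|T|}$$
where $X$ is the itemset, $T$ is the set of transactions, and $(i, t)$ is the identifier and itemset of a transaction.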
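As a rough illustration of support, here is a minimal sketch in plain Python. The transactions are hypothetical toy data, not rows from the MangaDex dataset, and the `support` helper is an illustrative name rather than part of PySpark.

```python
# Hypothetical toy transactions: each set holds the tags on one title.
transactions = [
    {"Action", "Comedy"},
    {"Action", "Romance"},
    {"Action", "Comedy", "Romance"},
    {"Comedy"},
    {"Action"},
]


def support(itemset, transactions):
    """Return the fraction of transactions that contain every tag in itemset."""
    itemset = set(itemset)
    hits = sum(1 for tags in transactions if itemset <= tags)
    return hits / len(transactions)


action_support = support({"Action"}, transactions)          # 4 of 5 transactions
pair_support = support({"Action", "Comedy"}, transactions)  # 2 of 5 transactions
print(action_support, pair_support)
```

Larger itemsets can never have higher support than their subsets, which is why the pair scores lower than "Action" alone.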
confidence
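Confidence is the fraction of transactions containing one set of tags (the antecedent) that also contain another set of tags (the consequent).
$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}$$
where $X \Rightarrow Y$ is the rule.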
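Confidence can be sketched the same way. The example below assumes hypothetical toy transactions and an illustrative `confidence` helper. It counts the transactions that contain the antecedent and checks how many of those also contain the consequent.

```python
# Hypothetical toy transactions: each set holds the tags on one title.
transactions = [
    {"Action", "Comedy"},
    {"Action", "Romance"},
    {"Action", "Comedy", "Romance"},
    {"Comedy"},
    {"Action"},
]


def confidence(antecedent, consequent, transactions):
    """Return conf(antecedent => consequent) computed from raw counts."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [tags for tags in transactions if antecedent <= tags]
    if not with_antecedent:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    with_both = sum(1 for tags in with_antecedent if consequent <= tags)
    return with_both / len(with_antecedent)


action_to_comedy = confidence({"Action"}, {"Comedy"}, transactions)  # 2 of 4
comedy_to_action = confidence({"Comedy"}, {"Action"}, transactions)  # 2 of 3
print(action_to_comedy, comedy_to_action)
```

The two results differ because confidence is directional: swapping antecedent and consequent changes the denominator.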
lift
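Lift measures how far two sets of tags are from being independent of each other. It is the ratio of the observed support of both sets together to the support expected if they were independent.
$$\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X) \times \mathrm{supp}(Y)}$$
When lift equals 1, the sets are independent. When lift is greater than 1, the sets are positively dependent and tend to appear together. When lift is less than 1, the sets are negatively dependent and tend not to appear together.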
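To make the interpretation of lift concrete, here is a minimal sketch on hypothetical toy transactions. The `lift` helper is an illustrative name, and it works from raw counts so that it equals the joint support divided by the product of the individual supports.

```python
# Hypothetical toy transactions: each set holds the tags on one title.
transactions = [
    {"Action", "Comedy"},
    {"Action", "Romance"},
    {"Action", "Comedy", "Romance"},
    {"Comedy"},
    {"Action"},
]


def lift(x, y, transactions):
    """Return lift(x => y) = supp(x | y) / (supp(x) * supp(y))."""
    x, y = set(x), set(y)
    n = len(transactions)
    count_x = sum(1 for tags in transactions if x <= tags)
    count_y = sum(1 for tags in transactions if y <= tags)
    count_both = sum(1 for tags in transactions if (x | y) <= tags)
    if count_x == 0 or count_y == 0:
        raise ValueError("lift is undefined when either itemset never occurs")
    # Equivalent to (count_both / n) / ((count_x / n) * (count_y / n)).
    return count_both * n / (count_x * count_y)


action_romance = lift({"Action"}, {"Romance"}, transactions)  # above 1
action_comedy = lift({"Action"}, {"Comedy"}, transactions)    # below 1
print(action_romance, action_comedy)
```

In this toy data, "Action" and "Romance" co-occur more often than independence would predict, while "Action" and "Comedy" co-occur less often. Unlike confidence, lift is symmetric in its two itemsets.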