Usage

Installation

To use pairwise-ranking, first install it using pip:

$ pip install pairwise-ranking

Loading data

Match data may be imported from several formats: .gml files, adjacency matrices, and lists of matches. The function ranking.read_match_list() attempts to detect which of these formats a file uses and imports it accordingly. Example data sets are provided in the data folder and are cited in Data sets.

ranking.read_match_list()

Read a list of matches from a file, attempting to detect whether it is in gml, match_list, or adjacency_matrix format.

Parameters: filename (str) – Input filename
Returns: List of matches, each represented by a dict of the winner and loser.
Return type: list

When the file format is known, the format-specific functions can be used directly:

ranking.read_match_list_from_match_list()

Read list of matches from a file where each line represents a match in the format “winner loser”.

Parameters: filename (str) – Input filename
Raises: AssertionError – If a line is not in “winner loser” format. Labels should not themselves include spaces.
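
To make the expected file layout concrete, the sketch below writes a small file in the “winner loser” format and parses it by hand. It is an illustration only, not the package's implementation: the helper name parse_match_lines and the dict keys 'winner' and 'loser' are assumptions, since the exact keys are not stated here.

```python
import os
import tempfile


def parse_match_lines(path):
    """Hypothetical stand-in for a "winner loser" reader (not the package's code)."""
    matches = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines (an assumption of this sketch)
            # Exactly two whitespace-separated labels are expected per line.
            assert len(fields) == 2, f"line {line_number} is not in 'winner loser' format"
            # The dict keys are assumed; check the package for the real ones.
            matches.append({"winner": fields[0], "loser": fields[1]})
    return matches


# Write a three-match example file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("alice bob\nbob carol\nalice carol\n")
    example_path = tmp.name

example_matches = parse_match_lines(example_path)
os.remove(example_path)
print(example_matches)
```

A label containing a space would split into more than two fields and trip the assertion, which is why labels should not include spaces.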

ranking.read_match_list_from_gml()

Read list of matches as the edges in a gml network file.

Parameters: filename (str) – Input filename
Returns: List of matches, each represented by a dict of the winner and loser.
Return type: list

ranking.read_match_list_from_adj_matrix()

Read a list of matches from an adjacency matrix that counts the number of times each player beats another. Because the matrix carries no labels, players are assigned labels of the form player_i.

Parameters: filename (str) – Input filename
Returns: List of matches, each represented by a dict of the winner and loser.
Return type: list
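
As a rough illustration of how a count matrix corresponds to a match list, the sketch below expands an in-memory matrix into one match per recorded win. It makes several assumptions that are not confirmed here: that entry [i][j] counts wins of player i over player j (row beats column), that indices start at 0, and that the dict keys are 'winner' and 'loser'. The helper matches_from_adjacency is hypothetical and is not part of the package.

```python
def matches_from_adjacency(matrix):
    """Expand a count matrix into one match per recorded win (illustrative only)."""
    matches = []
    for i, row in enumerate(matrix):
        for j, wins in enumerate(row):
            # Assumption: matrix[i][j] is the number of times player i beat player j.
            for _ in range(wins):
                matches.append({"winner": f"player_{i}", "loser": f"player_{j}"})
    return matches


# player_0 beat player_1 twice; player_1 beat player_2 once; player_2 beat player_0 once.
counts = [
    [0, 2, 0],
    [0, 0, 1],
    [1, 0, 0],
]
adjacency_matches = matches_from_adjacency(counts)
print(len(adjacency_matches))
```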

Inference

For the models implemented in this package (described in Models), point estimates of the strength scores can be obtained with the function ranking.scores().

ranking.scores()

Get fitted scores of players in a given model.

Parameters:

match_list (list) – List of matches, each represented by a dict of the winner and loser.
model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.
num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000
num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4
force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Returns:

pandas DataFrame with columns of the labels, inferred scores, and score errors for each player.

Return type:

DataFrame

The rankings implied by a match_list, listed in decreasing order of the score estimates, can be found with ranking.ranks():

ranking.ranks()

Find the rankings of the players according to a specified model.

Parameters:

match_list (list) – List of matches, each represented by a dict of the winner and loser.
model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.
num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000
num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4
force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Returns:

Ranked list of the players in descending order of strength.

Return type:

list
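
The relationship between scores and ranks can be sketched without running any inference: a ranking is simply the list of labels sorted by score, strongest first. The labels and score values below are made up for illustration and do not come from the package.

```python
# Hypothetical (label, score) pairs standing in for the output of a fitted model.
fitted_scores = [("carol", -0.8), ("alice", 1.3), ("bob", 0.1)]

# Sort by score, strongest first, and keep only the labels.
ranked_labels = [
    label
    for label, _ in sorted(fitted_scores, key=lambda pair: pair[1], reverse=True)
]
print(ranked_labels)
```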

We can also infer the probability that one player beats another:

ranking.probability()

Inferred probability that one player will beat another, according to match_list data.

Parameters:

match_list (list) – List of matches, each represented by a dict of the winner and loser.
winner_label (str) – Label of the desired winner
loser_label (str) – Label of the desired loser
model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.
num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000
num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4
force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Raises:

AssertionError – If winner_label or loser_label is not present in match_list.

Returns:

Tuple of the inferred probability and the error in the estimation.

Return type:

tuple
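
Because a missing label raises an AssertionError, it can help to check both labels before starting a potentially slow fit. The sketch below shows one way to do that in plain Python. The helper labels_in and the dict keys 'winner' and 'loser' are assumptions made for illustration.

```python
def labels_in(match_list):
    """Collect every label that appears as a winner or a loser."""
    labels = set()
    for match in match_list:
        # The keys 'winner' and 'loser' are assumed for illustration.
        labels.add(match["winner"])
        labels.add(match["loser"])
    return labels


toy_matches = [
    {"winner": "alice", "loser": "bob"},
    {"winner": "bob", "loser": "carol"},
]
known = labels_in(toy_matches)

# Check both labels up front instead of waiting for an AssertionError.
missing = [label for label in ("alice", "dave") if label not in known]
print(missing)
```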

Sampling

The package wraps Hamiltonian Monte Carlo (HMC) sampling, via pystan, for the models it implements:

ranking.samples()

Get MCMC samples from the model fit to a match_list.

Parameters:

match_list (list) – List of matches, each represented by a dict of the winner and loser.
model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.
num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 10000
num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4

Returns:

pandas DataFrame containing sampled draws of scores and relevant parameters.

Return type:

DataFrame
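
One common way to turn sampled draws into a point estimate is to average them per player and report the spread of the draws as an uncertainty. The sketch below does this with made-up numbers using only the standard library. Whether the package computes its ‘average’ estimates or score errors in exactly this way is not stated here, so treat it as an assumption.

```python
from statistics import mean, stdev

# Hypothetical posterior draws of each player's score (a real run would have thousands).
draws = {
    "alice": [1.2, 1.4, 1.3, 1.1],
    "bob": [0.0, 0.2, 0.1, 0.1],
}

# Posterior mean as the point estimate, sample standard deviation as its spread.
summary = {label: (mean(values), stdev(values)) for label, values in draws.items()}
for label, (estimate, spread) in summary.items():
    print(f"{label}: {estimate:.2f} +/- {spread:.2f}")
```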

These samples may also be used to visualize the posterior distribution of the depth and luck parameters in the full (depth_and_luck) model, using matplotlib.pyplot:

ranking.draw_depth_and_luck_posterior()

Draw the posterior distribution of the luck and depth parameters from sampled values of the depth_and_luck model.

Parameters:

match_list (list) – List of matches, each represented by a dict of the winner and loser.
num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000
num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4