Usage

Installation

To use pairwise-ranking, first install it using pip:

$ pip install pairwise-ranking

Loading data

Match data may be imported from several formats: .gml files, adjacency matrices, and lists of matches. The function ranking.read_match_list() attempts to detect which of these formats a file uses and imports it accordingly. Example data sets are provided in the data folder and are cited in Data sets.

ranking.read_match_list()

Read a list of matches from a file, attempting to detect whether the file is in gml, match_list, or adjacency_matrix format.

Parameters:

filename (str) – Input filename

Returns:

List of matches, each represented by a dict of the winner and loser.

Return type:

list

When the file format is known, the format-specific functions can be used directly:

ranking.read_match_list_from_match_list()

Read list of matches from a file where each line represents a match in the format “winner loser”.

Parameters:

filename (str) – Input filename

Raises:

AssertionError – If a line is not in “winner loser” format. Labels should not themselves include spaces.
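The "winner loser" format described above can be illustrated with a small standalone parser. This is a minimal sketch and not the package's implementation: the helper name parse_match_lines, the dict keys "winner" and "loser", and the skipping of blank lines are all assumptions made for illustration.

```python
import os
import tempfile


def parse_match_lines(path):
    """Parse a file with one "winner loser" pair per line (illustrative only)."""
    matches = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # assumption: blank lines are skipped
            # Labels cannot contain spaces, so a valid line has exactly two fields.
            assert len(fields) == 2, (
                f"line {line_number} is not in 'winner loser' format"
            )
            # Assumption: the dict keys are named "winner" and "loser".
            matches.append({"winner": fields[0], "loser": fields[1]})
    return matches


# Write a small hypothetical input file and parse it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "matches.txt")
    with open(example_path, "w", encoding="utf-8") as handle:
        handle.write("alice bob\nbob carol\nalice carol\n")
    parsed = parse_match_lines(example_path)
```

A line such as "alice bob carol" would trip the assertion, which mirrors the AssertionError documented above.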

ranking.read_match_list_from_gml()

Read list of matches as the edges in a gml network file.

Parameters:

filename (str) – Input filename

Returns:

List of matches, each represented by a dict of the winner and loser.

Return type:

list

ranking.read_match_list_from_adj_matrix()

Read list of matches from an adjacency matrix which counts the number of times each player beats another. Because the matrix carries no labels, players are assigned labels of the form player_i.

Parameters:

filename (str) – Input filename

Returns:

List of matches, each represented by a dict of the winner and loser.

Return type:

list
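To make the adjacency-matrix convention concrete, the sketch below expands a small matrix of win counts into a list of matches. It is not the package's code. It assumes that entry [i][j] counts the wins of player i over player j, that labels are zero-based (player_0, player_1, ...), and that the dict keys are "winner" and "loser"; the package may use different conventions.

```python
def matches_from_adjacency(matrix):
    """Expand a square matrix of win counts into one dict per match."""
    size = len(matrix)
    assert all(len(row) == size for row in matrix), "matrix must be square"
    matches = []
    for i, row in enumerate(matrix):
        for j, wins in enumerate(row):
            # Assumption: matrix[i][j] is the number of times i beat j.
            matches.extend(
                {"winner": f"player_{i}", "loser": f"player_{j}"}
                for _ in range(wins)
            )
    return matches


# Hypothetical win counts for three players.
win_counts = [
    [0, 2, 1],
    [0, 0, 1],
    [1, 0, 0],
]
expanded = matches_from_adjacency(win_counts)
```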

Inference

For the models implemented in this package (described in models), point estimates of the strength scores can be computed with the function ranking.scores().

ranking.scores()

Get fitted scores of players in a given model.

Parameters:
  • match_list (list) – List of matches, each represented by a dict of the winner and loser.

  • model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.

  • num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000

  • num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4

  • force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Returns:

pandas DataFrame with columns of the labels, inferred scores, and score errors for each player.

Return type:

DataFrame

The rankings from a match_list, listed in decreasing order of the score estimates, may be found with ranking.ranks():

ranking.ranks()

Find the rankings of the players according to a specified model.

Parameters:
  • match_list (list) – List of matches, each represented by a dict of the winner and loser.

  • model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.

  • num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000

  • num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4

  • force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Returns:

Ranked list of the players in descending order of strength.

Return type:

list
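A typical call sequence for ranking.scores() and ranking.ranks() might look like the sketch below. The keyword names come from the parameter lists above. The import name ranking, the dict keys "winner" and "loser", the toy data, and the helper fit_scores_and_ranks are assumptions for illustration. The import sits inside the function so that the block can be read and loaded without the package installed.

```python
# Hypothetical toy data; the dict keys "winner" and "loser" are an assumption.
toy_matches = [
    {"winner": "alice", "loser": "bob"},
    {"winner": "alice", "loser": "carol"},
    {"winner": "bob", "loser": "carol"},
    {"winner": "carol", "loser": "bob"},
]


def fit_scores_and_ranks(matches, model_name="depth_and_luck"):
    """Return the score table and the ranked list for one model."""
    import ranking  # assumption: pairwise-ranking is importable as `ranking`

    # Fewer samples and chains than the documented defaults, to keep a
    # quick exploratory run short.
    score_table = ranking.scores(
        matches, model_name=model_name, num_samples=1000, num_chains=2
    )
    ordering = ranking.ranks(
        matches, model_name=model_name, num_samples=1000, num_chains=2
    )
    return score_table, ordering


# Usage, with the package installed:
# score_table, ordering = fit_scores_and_ranks(toy_matches, "depth_only")
```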

We can also infer the probability that one player beats another:

ranking.probability()

Inferred probability that one player will beat another, according to match_list data.

Parameters:
  • match_list (list) – List of matches, each represented by a dict of the winner and loser.

  • winner_label (str) – Label of the desired winner

  • loser_label (str) – Label of the desired loser

  • model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.

  • num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000

  • num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4

  • force_mode (str or None, optional) – Force the point estimate mode to be either ‘average’ or ‘MAP’. If not set, ‘MAP’ is used when more than 1000 players are present.

Raises:

AssertionError – If winner_label or loser_label is not present in match_list.

Returns:

Tuple of the inferred probability and the error in the estimation.

Return type:

tuple
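Because ranking.probability() raises an AssertionError for unknown labels, it can help to check the labels before starting a fit. The sketch below does this with two hypothetical helpers, player_labels and win_probability. The dict keys "winner" and "loser" and the import name ranking are assumptions, and the toy data is made up.

```python
# Hypothetical toy data; the dict keys "winner" and "loser" are an assumption.
toy_matches = [
    {"winner": "alice", "loser": "bob"},
    {"winner": "bob", "loser": "carol"},
    {"winner": "alice", "loser": "carol"},
]


def player_labels(matches):
    """Collect every label that appears as a winner or a loser."""
    labels = set()
    for match in matches:
        labels.add(match["winner"])
        labels.add(match["loser"])
    return labels


def win_probability(matches, winner_label, loser_label):
    """Check both labels, then return (probability, error) from the package."""
    missing = {winner_label, loser_label} - player_labels(matches)
    if missing:
        # Fail early with a readable message instead of after model setup.
        raise ValueError(f"labels not found in match_list: {sorted(missing)}")

    import ranking  # assumption: pairwise-ranking is importable as `ranking`

    estimate, error = ranking.probability(
        matches, winner_label=winner_label, loser_label=loser_label
    )
    return estimate, error


# Usage, with the package installed:
# estimate, error = win_probability(toy_matches, "alice", "carol")
```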

Sampling

We implement a wrapper for Hamiltonian Monte Carlo (HMC) sampling via pystan for the models considered in this package:

ranking.samples()

Get MCMC samples from the model fit to a match_list.

Parameters:
  • match_list (list) – List of matches, each represented by a dict of the winner and loser.

  • model_name (str, optional) – Model used for fitting. Defaults to ‘depth_and_luck’. Options: {‘depth_and_luck’, ‘depth_only’, ‘luck_only’, ‘logistic_prior’}.

  • num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 10000

  • num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4

Returns:

pandas DataFrame containing sampled draws of scores and relevant parameters.

Return type:

DataFrame
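Drawing samples directly might look like the sketch below. The keyword names follow the parameter list above. The helper draw_posterior_samples and the import name ranking are assumptions. Since the column names of the returned DataFrame are not listed here, the sketch relies only on the generic pandas method describe().

```python
def draw_posterior_samples(matches, num_samples=2000, num_chains=4):
    """Return the sampled draws together with a per-column summary."""
    import ranking  # assumption: pairwise-ranking is importable as `ranking`

    draws = ranking.samples(
        matches,
        model_name="depth_and_luck",
        num_samples=num_samples,
        num_chains=num_chains,
    )
    # describe() reports count, mean, std, and quantiles for each numeric column.
    return draws, draws.describe()


# Usage, with the package installed and a match_list loaded:
# draws, summary = draw_posterior_samples(match_list)
```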

These samples may also be used to visualize the posterior distribution of the depth and luck in the full model using matplotlib.pyplot:

ranking.draw_depth_and_luck_posterior()

Draw the posterior distribution of the luck and depth parameters from sampled values of the depth_and_luck model.

Parameters:
  • match_list (list) – List of matches, each represented by a dict of the winner and loser.

  • num_samples (int, optional) – Number of samples used per chain for MCMC sampling, defaults to 5000

  • num_chains (int, optional) – Number of chains used for MCMC sampling, defaults to 4