Package weka.core

Class RandomSample

java.lang.Object
weka.core.RandomSample

public class RandomSample extends Object
Class holding static utility methods for drawing random samples.
Version:
$Revision: 0$
Author:
eibe@cs.waikato.ac.nz
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static int[]
    drawSortedDenseSample(int n, int N, Random r)
    Draws a sorted list of n distinct integers from 0, 1, ..., N - 1 based on the simple Algorithm A in
    static int[]
    drawSortedSample(int n, int N, Random r)
    Returns a sorted list of n distinct integers that are randomly chosen from 0, 1, ..., N - 1.
    static int[]
    drawSortedSparseSample(int n, int N, Random r)
    Draws a sorted list of n distinct integers from 0, 1, ..., N - 1 based on drawSparseSample() followed by radix sort.
    static int[]
    drawSparseSample(int n, int N, Random r)
    Draws n distinct integers from 0, 1, ..., N - 1, randomly ordered, using a partial Fisher-Yates shuffle and a hash map.
    static int[]
    Sorts the given array of non-negative integers in ascending order using LSD radix sort.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • RandomSample

      public RandomSample()
  • Method Details

    • drawSortedSample

      public static int[] drawSortedSample(int n, int N, Random r) throws IllegalArgumentException
      Returns a sorted list of n distinct integers that are randomly chosen from 0, 1, ..., N - 1. It uses the drawSortedDenseSample() in this class if n > (int)(0.2 * N), which has complexity O(N). Otherwise, it uses drawSortedSparseSample(), which has complexity O(n).
      Parameters:
      n - the number of samples to take
      N - one greater than the largest integer that can possibly be included in the sample
      r - the random number generator to use
      Returns:
      Throws:
      IllegalArgumentException - if a sample cannot be taken based on the given parameters
    • drawSortedDenseSample

      public static int[] drawSortedDenseSample(int n, int N, Random r) throws IllegalArgumentException
      Draws a sorted list of n distinct integers from 0, 1, ..., N - 1 based on the simple Algorithm A in

      J.S. Vitter (1987) "An Efficient Algorithm for Sequential Random Sampling". ACM Trans Math Software, 13(1).

      This algorithm has time complexity O(N) but only requires O(n) random numbers to be generated. Space complexity is O(n). Useful if 0 << n / N.

      Throws:
      IllegalArgumentException
    • drawSortedSparseSample

      public static int[] drawSortedSparseSample(int n, int N, Random r) throws IllegalArgumentException
      Draws a sorted list of n distinct integers from 0, 1, ..., N - 1 based on drawSparseSample() followed by radix sort. The time complexity of this method is O(n). Useful if n << N.
      Throws:
      IllegalArgumentException
    • radixSortOfPositiveIntegers

      public static int[] radixSortOfPositiveIntegers(int[] a)
      Sorts the given array of non-negative integers in ascending order using LSD radix sort. The result will be undefined if negative integers are included in the input.
      Parameters:
      a - the array to be sorted
      Returns:
      the array with the result;
    • drawSparseSample

      public static int[] drawSparseSample(int n, int N, Random r) throws IllegalArgumentException
      Draws n distinct integers from 0, 1, ..., N - 1, randomly ordered, using a partial Fisher-Yates shuffle and a hash map. The idea of using a hash map is from

      D.N. Bui (2015) "CacheDiff: Fast Random Sampling" https://arxiv.org/abs/1512.00501

      This algorithm has time and space complexity O(n). Useful if n << N.

      Throws:
      IllegalArgumentException