Published March 31, 2005 | Version v1
Dataset Open

xfree

Authors/Creators

Description

This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering.

  1. Title: XFree86 Repository Transaction Data

  2. Sources: (a) Original creators of database: Bart Massey 01 503 725-5393 Computer Science Dept. PO Box 751 MS CMPS Portland State University Portland, OR USA 97207-0751 bart@cs.pdx.edu

(b) Donor of database: owner

(c) Date received: 31 March 2005

  1. Past Usage: none

  2. Relevant Information:

This dataset was assembled by analyzing the publicly-available CVS archives of the XFree86 Project http://xfree86.org)) using a modified version of CVSAnalY (https://github.com/MetricsGrimoire/CVSAnalY) It is intended for a wide variety of uses, and thus no dependent variable is specified. See Massey’s PROMISE 2005 paper, “Longitudinal Analysis of Long-Timescale Open Source Repository Data”, for further information. Note that on 487 occasions, a negative lines-added count was recorded by CVSAnalY, apparently due to a bug. These entries should probably be repaired or at least filtered out.

  1. Number of Instances: 175658

  2. Number of Attributes: 10 (incl. 2 inferred)

  3. Attribute information:

    • Inferred filetype:
      • non-numeric—nominal
      • 9 values (documentation,images,i18n,ui,multimedia,code,build,devel-doc,unknown)
    • File Pathname: UNIX relative path name
      • non-numeric—structured
      • 1935 directories, 27009 files
    • Revision: dotted-decimal revision string
      • non-numeric—structured
      • 7182 unique revision numbers
    • Author ID: integer identifier
      • non-numeric—nominal
      • 21 unique authors
    • Lines added: integer
      • numeric—integer
      • MIN: 0 MAX: 699789 MEAN: 124.274 STDEV: 2450.66
      • MIN: 0 MAX: 70554 MEAN: 41.3577 STDEV: 407.737
      • (excl. images, multimedia)
      • Note: on 487 occasions, a negative lines-added count was recorded by CVSAnalY, apparently due to a bug. In the summary statistics reported in this comment, these additions are not counted.
    • Lines removed: integer
      • numeric—integer
      • MIN: 0 MAX: 123751 MEAN: 27.8503 STDEV: 624.008
      • MIN: 0 MAX: 55664 MEAN: 17.3073 STDEV: 294.047
      • (excl. images, multimedia)
    • File has since been removed (“in Attic”): boolean (0, 1)
      • non-numeric—nominal
      • 40.67% positive
    • Commit has CVS_SILENT flag: boolean (0, 1)
      • non-numeric—nominal
      • never set
    • Inferred that committer was not author: boolean (0, 1)
      • non-numeric—nominal
      • 1.04% positive
    • Commit date: date
      • MIN: “1994-04-27 07:07:10” MAX: “2005-02-19 02:22:07”
      • Note: CVS has trouble with timezones; we assume all dates are UTC.
  4. Missing Attribute Values:

  • unknown inferred filetype (attr 1): 18719
  1. Class Distribution: N/A

Files

README.txt

Files (15.0 MB)

Name Size Download all
md5:123b867fceeb89fe8a45fa6c493385ac
223 Bytes Preview Download
md5:e08fdf0b7717fc100848bcabcf04e360
15.0 MB Download