ivxj

Attributes

__version__

Functions

ivxj(data, rhoz[, identity, time, y_name, x_name])

IVXJ Estimation for Unbalanced Panel Data (Univariate Case).

Package Contents

ivxj.__version__
ivxj.ivxj(data, rhoz, identity=None, time=None, y_name=None, x_name=None)

IVXJ Estimation for Unbalanced Panel Data (Univariate Case).

This function performs Instrumental Variable with XJ (IVXJ) estimation on unbalanced panel data with a single regressor (the univariate case). It sorts the panel, extracts the dependent and independent variables, and applies the IVXJ method. It returns the IVX and IVXJ estimates of the coefficient, the standard error, and the XJ estimate of rho.

The method is designed for unbalanced panels, in which the number of observed time periods may differ across entities.
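To make "unbalanced" concrete, the following is a minimal sketch that uses only pandas. The values are made up for illustration, and the sort by entity and time is an assumption about what "sorts the panel data" means here, not a description of the package internals.

```python
import pandas as pd

# Illustrative (made-up) panel: entity 1 has 5 periods, entity 2 only 3,
# and the rows are deliberately out of order.
panel = pd.DataFrame({
    "id":   [2, 1, 1, 2, 1, 1, 2, 1],
    "time": [1, 3, 1, 3, 2, 5, 2, 4],
    "y":    [0.2, 0.1, 0.4, 0.3, 0.0, 0.5, 0.1, 0.2],
    "x":    [1.5, 2.0, 1.8, 1.7, 1.9, 2.2, 1.6, 2.1],
})

# Sort by entity, then by time within each entity (assumed ordering).
panel_sorted = panel.sort_values(["id", "time"]).reset_index(drop=True)

# Count observed periods per entity; unequal counts mean the panel is unbalanced.
periods_per_entity = panel_sorted.groupby("id")["time"].count()
is_balanced = periods_per_entity.nunique() == 1
```

Here `periods_per_entity` differs across entities, so `is_balanced` is false, which is the setting the function targets.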

Parameters:
  • data (pandas.DataFrame) – A DataFrame containing unbalanced panel data. It must include columns for an entity identifier, a time variable, and both dependent and independent variables.

  • rhoz (float) – A user-defined IVX parameter, denoted as rho_z, controlling the strength of persistence in the instruments.

  • identity (str, optional) – The name of the column in data representing the individual entity (cross-sectional unit). If None, the first column of data is used as the identity column.

  • time (str, optional) – The name of the column in data representing the time dimension. If None, the second column of data is used as the time variable.

  • y_name (str, optional) – The name of the column in data representing the dependent variable. If None, the third column of data is used as the dependent variable.

  • x_name (str, optional) – The name of the column in data representing the independent variable. If None, the fourth column of data is used as the independent variable.
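For intuition about rhoz, the sketch below builds an instrument with the textbook IVX recursion z_t = rhoz * z_{t-1} + (x_t - x_{t-1}), starting from z_0 = 0, for a single entity's series. The helper name `ivx_instrument` is hypothetical, and whether ivxj constructs its instruments in exactly this way is an assumption; the page above only states that rhoz controls the persistence of the instruments.

```python
import numpy as np

def ivx_instrument(x, rhoz):
    """Hypothetical helper: textbook IVX filter for one entity's series.

    z_0 = 0 and z_t = rhoz * z_{t-1} + (x_t - x_{t-1}) for t >= 1.
    This is an illustration, not the package's implementation.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x, prepend=x[0])  # first differences, with dx[0] = 0
    z = np.zeros_like(x)
    for t in range(1, len(x)):
        z[t] = rhoz * z[t - 1] + dx[t]
    return z

x = [1.0, 2.0, 4.0, 7.0]
z_half = ivx_instrument(x, rhoz=0.5)  # mildly persistent instrument
z_one = ivx_instrument(x, rhoz=1.0)   # recovers x_t - x_0
z_zero = ivx_instrument(x, rhoz=0.0)  # recovers the first difference of x
```

Under this recursion, values of rhoz closer to 1 make the instrument track the level of x more closely, while values closer to 0 make it behave like the first difference.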

Returns:

  • btaHat (numpy.ndarray) – The IVX estimate of the coefficient.

  • btaHatDebias (numpy.ndarray) – The IVXJ estimate of the coefficient.

  • se (numpy.ndarray) – The standard error of the IVXJ estimate.

  • rhoHat (float) – The XJ estimate of rho.

Raises:
  • KeyError – If the specified column names for identity, time, y, or x do not exist in the data DataFrame.

  • ValueError – If the data does not contain enough columns to assign variables when default column indices are used.
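The defaults and error conditions above can be mirrored in a small pre-check run before calling the estimator. This is a sketch only: `resolve_columns` is a hypothetical helper, and the exact checks and messages used inside ivxj are not shown on this page.

```python
import pandas as pd

def resolve_columns(data, identity=None, time=None, y_name=None, x_name=None):
    """Hypothetical pre-check mirroring the documented defaults and errors."""
    names = [identity, time, y_name, x_name]

    # A name left as None falls back to the column at the same position (0..3).
    default_positions = [i for i, n in enumerate(names) if n is None]
    if default_positions and data.shape[1] <= max(default_positions):
        raise ValueError("not enough columns to assign variables by default position")

    resolved = [data.columns[i] if n is None else n for i, n in enumerate(names)]

    # Explicitly named columns must exist in the DataFrame.
    missing = [n for n in resolved if n not in data.columns]
    if missing:
        raise KeyError(f"columns not found in data: {missing}")
    return tuple(resolved)

df = pd.DataFrame({"id": [1], "time": [1], "y": [0.0], "x": [1.0]})
cols = resolve_columns(df)  # all four names taken from column positions
```

Calling `resolve_columns(df, x_name="z")` raises KeyError, and calling it on a two-column frame with no names given raises ValueError, matching the conditions listed above.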

Examples

Example 1: Applying IVXJ with explicit column names (here a panel of 2 entities with 21 periods each)

>>> import pandas as pd
>>> import numpy as np
>>> from ivxj import ivxj
>>> data = pd.DataFrame({
...     'id': np.repeat([1, 2], 21),
...     'time': np.tile(np.arange(1, 22), 2),
...     'y': np.random.randint(0, 2, 42),
...     'x': np.round(np.random.uniform(1, 3, 42), 1)
... })
>>> rhoz = 0.9
>>> btaHat, btaHatDebias, se, rhoHat = ivxj(data, rhoz, 'id', 'time', 'y', 'x')
>>> print(btaHat, btaHatDebias, se, rhoHat)

Example 2: Using default columns for entity, time, dependent, and independent variables

>>> ivxj(data, rhoz)

References

For more details on the IVXJ method, see the original paper by Liao, Mei and Shi (2024).