ivxj
====

.. py:module:: ivxj


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/ivxj/delete_period_obs/index
   /autoapi/ivxj/gen_ivx/index
   /autoapi/ivxj/ivxj/index
   /autoapi/ivxj/split_mat_into_cells/index
   /autoapi/ivxj/within_trans/index
   /autoapi/ivxj/xj/index


Attributes
----------

.. autoapisummary::

   ivxj.__version__


Functions
---------

.. autoapisummary::

   ivxj.ivxj


Package Contents
----------------

.. py:data:: __version__

.. py:function:: ivxj(data, rhoz, identity=None, time=None, y_name=None, x_name=None)

   IVXJ Estimation for Unbalanced Panel Data (Univariate Case).

   This function performs Instrumental Variable with XJ (IVXJ) estimation on unbalanced panel data in a univariate setting (a single independent variable). It sorts the panel data, extracts the dependent and independent variables, and applies the IVXJ method. It returns the IVX and IVXJ estimates of the coefficient, the standard error of the IVXJ estimate, and the XJ estimate of rho.

   The method is designed for unbalanced panel data, in which the number of observed time periods may differ across entities.
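
   As a rough illustration of what "unbalanced" means here, the sketch below builds a small hypothetical panel, sorts it by entity and time, and counts the periods observed for each entity. The column names (``id``, ``time``, ``y``, ``x``) and the data are assumptions made for this example. The sorting step only mirrors the behaviour described above and is not taken from the package source.

   .. code-block:: python

      import numpy as np
      import pandas as pd

      # Hypothetical panel: entity 1 is observed for 5 periods, entity 2 for only 3.
      panel = pd.DataFrame({
          "id": [2, 1, 1, 2, 1, 1, 2, 1],
          "time": [3, 2, 1, 1, 4, 3, 2, 5],
          "y": np.arange(8, dtype=float),
          "x": np.linspace(1.0, 2.0, 8),
      })

      # Sort by entity, then by time within each entity.
      panel_sorted = panel.sort_values(["id", "time"]).reset_index(drop=True)

      # Count observed periods per entity; unequal counts mean the panel is unbalanced.
      periods_per_entity = panel_sorted.groupby("id")["time"].nunique()
      is_unbalanced = periods_per_entity.nunique() > 1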

   :param data: A DataFrame containing unbalanced panel data. It must include columns for an entity identifier, a time variable, and both dependent and independent variables.
   :type data: pandas.DataFrame
   :param rhoz: A user-specified IVX parameter, denoted ``rho_z``, that controls the persistence of the instruments.
   :type rhoz: float
   :param identity: The name of the column in ``data`` representing the individual entity (cross-sectional unit). If None, the first column of ``data`` is used as the identity column.
   :type identity: str, optional
   :param time: The name of the column in ``data`` representing the time dimension. If None, the second column of ``data`` is used as the time variable.
   :type time: str, optional
   :param y_name: The name of the column in ``data`` representing the dependent variable. If None, the third column of ``data`` is used as the dependent variable.
   :type y_name: str, optional
   :param x_name: The name of the column in ``data`` representing the independent variable. If None, the fourth column of ``data`` is used as the independent variable.
   :type x_name: str, optional

   :returns: * **btaHat** (*numpy.ndarray*) -- The IVX estimate of the coefficient.
             * **btaHatDebias** (*numpy.ndarray*) -- The IVXJ estimate of the coefficient.
             * **se** (*numpy.ndarray*) -- The standard error of the IVXJ estimate.
             * **rhoHat** (*float*) -- The XJ estimate of rho.

   :raises KeyError: If a column name given for ``identity``, ``time``, ``y_name``, or ``x_name`` does not exist in ``data``.
   :raises ValueError: If ``data`` does not contain enough columns to assign the variables by position when the default column indices are used.
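
   The defaulting and error rules documented above can be sketched as a small stand-alone helper. ``resolve_columns`` is a hypothetical name introduced only for this illustration; it is not part of the package and may not match how ``ivxj`` resolves columns internally. It can be useful as a pre-check before calling the function.

   .. code-block:: python

      import pandas as pd

      def resolve_columns(data, identity=None, time=None, y_name=None, x_name=None):
          """Hypothetical helper: map the four roles to column names in ``data``."""
          resolved = []
          for position, name in enumerate([identity, time, y_name, x_name]):
              if name is None:
                  # Fall back to the column at this position (first to fourth).
                  if position >= data.shape[1]:
                      raise ValueError("data has too few columns for positional defaults")
                  name = data.columns[position]
              elif name not in data.columns:
                  raise KeyError(f"column {name!r} not found in data")
              resolved.append(name)
          return resolved

      frame = pd.DataFrame({"id": [1, 1], "time": [1, 2], "y": [0.1, 0.2], "x": [1.0, 1.1]})
      default_cols = resolve_columns(frame)                      # positional defaults
      swapped_cols = resolve_columns(frame, y_name="x", x_name="y")  # explicit names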

   .. rubric:: Examples

   Example 1: Applying IVXJ to a small synthetic panel (two entities with 21 periods each)

   >>> import pandas as pd
   >>> import numpy as np
   >>> from ivxj import ivxj
   >>> np.random.seed(0)  # make the synthetic data reproducible
   >>> data = pd.DataFrame({
   ...     'id': np.repeat([1, 2], 21),
   ...     'time': np.tile(np.arange(1, 22), 2),
   ...     'y': np.random.randint(0, 2, 42),
   ...     'x': np.round(np.random.uniform(1, 3, 42), 1)
   ... })
   >>> rhoz = 0.9
   >>> btaHat, btaHatDebias, se, rhoHat = ivxj(data, rhoz, 'id', 'time', 'y', 'x')
   >>> print(btaHat, btaHatDebias, se, rhoHat)

   Example 2: Relying on the default column positions (first to fourth columns: entity, time, dependent variable, independent variable)

   >>> btaHat, btaHatDebias, se, rhoHat = ivxj(data, rhoz)
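
   To give some intuition for the ``rhoz`` argument used in the examples above, the following sketch applies the textbook IVX filter from the predictive-regression literature to a single series: the instrument accumulates first differences of the regressor with geometric decay ``rhoz``. This is an assumption-laden illustration only. Whether the ``gen_ivx`` submodule uses this exact recursion, initialisation, or per-entity handling is not confirmed by this page, and ``ivx_instrument`` is a hypothetical name.

   .. code-block:: python

      import numpy as np

      def ivx_instrument(x, rhoz):
          """Textbook IVX filter: z[t] = rhoz * z[t-1] + (x[t] - x[t-1]), with z[0] = 0."""
          x = np.asarray(x, dtype=float)
          dx = np.diff(x, prepend=x[0])  # first differences, with dx[0] = 0
          z = np.zeros_like(x)
          for t in range(1, len(x)):
              z[t] = rhoz * z[t - 1] + dx[t]
          return z

      x = np.array([1.0, 1.5, 1.2, 1.8])
      z = ivx_instrument(x, rhoz=0.9)

   With ``rhoz`` equal to 1 the filter reproduces ``x - x[0]``, and with ``rhoz`` equal to 0 it reduces to the first difference, so values between 0 and 1 interpolate between a fully persistent and a non-persistent instrument.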

   .. rubric:: References

   For more details on the IVXJ method, see the original paper by Liao, Mei and Shi (2024).