The CAS Loss Reserving Database

The CAS Loss Reserving Database is a collection of loss data originally compiled by Glenn Meyers intended to be used for claims reserving studies. The data includes major personal and commercial lines of business from U.S. property casualty insurers. The claims data comes from Schedule P Analysis of Losses and Loss Expenses in the National Association of Insurance Commissioners (NAIC) database. The final product is a dataset that contains run-off triangles of six lines of business for all U.S. property casualty insurers. The triangle data correspond to claims of accident year 1988-1997 with 10 years development lag. Both upper and lower triangles are included so that one could use the data to develop a model and then test its performance retrospectively 1.

The CAS Loss Reserving Database contains the following fields:
loss_key

Currently limited to "com_auto", but will be expanded in a future release (see mapping below)

grcode

NAIC company code (including insurer groups and single insurers)

grname

NAIC company name (including insurer groups and single insurers)

origin

Accident year

dev

Development year

incrd_loss

Incurred losses and allocated expenses reported at year end

paid_loss

paid losses and allocated expenses at year end

ep_dir

Premiums earned at incurral year - direct and assumed

ep_ceded

Premiums earned at incurral year - ceded

ep_net

earned_prem_dir - earned_prem_ceded

single

1 indicates a single entity, 0 indicates a group insurer

posted_reserve_97

Posted reserves in year 1997 taken from the Underwriting and Investment Exhibit Part 2A, including net losses unpaid and unpaid loss adjustment expenses

train_ind

A 1 indicates that the value associated with a particular origin-dev combination would fall in the upper-left of a typical loss triangle. * indicates what would typically be indentified as a future observation at the time triangles are constructed. The lower right triangle values can be treated as a holdout set to test the adequacy of a reserve estimator.

The following table provides a description of the type of losses associated with each unique combination of loss_key and loss_type_suffix:

Loss Description

loss_type_suffix

loss_key

description

D

wc

Workers Compensation

R

prod_liab

Products Liability - Occurance

B

pp_auto

Private Passenger Auto Liability/Medical

H

othr_liab

Other Liability - Occurrence

F

med_mal

Medical Malpractice - Claims Made

C

comauto

Commercial Auto Liability/Medical

A copy of the CAS Loss Reserving Database is distributed along with trikit, in addition to a number of convenience functions intended to simplify database interaction. In what follows, CAS Loss Reserving Database convenience functions are detailed.

Loss Reserve Database API

load_lrdb(tri_type=None, loss_type='incurred', lob='comauto', grcode=1767, grname=None, train_only=True)

Load the CAS Loss Reserving Database subset of losses. Additional keyword arguments are used to subset the CAS Loss Reserving Database to the records of interest. Within the Loss Reserving Database, “lob” and “grcode” uniquely partition losses into 100 record blocks if lower_right_ind=True, otherwise losses are partitioned into 55 record blocks. All available combinations of “lob” and “grcode” (referred to as “specs”) can be obtained by calling get_lrdb_specs. Note that when tri_type is “cum” or “incr”, the remaining subset of records after applying lob, grcode and grname filters will be aggregated into a single triangle of the specified type.

Parameters
  • tri_type (str) – If None, lrdb subset is returned as pd.DataFrame. Otherwise, return subset as either incremental or cumulative triangle type. Default value is None.

  • lob (str) – Specifies the line of business to return. Available options are ['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']. Default value is “comauto”.

  • grcode (int) – NAIC company code including insurer groups and single insurers. For a full listing, call get_lrdb_specs. Default value is 1767.

  • grname (str) – NAIC company name (including insurer groups and single insurers). The complete mapping of available grcodes can be obtained by calling get_lrdb_specs. Default value is None.

  • loss_type (str) – Specifies which loss data to load. Can be one of “paid” or “incurred”. Defaults to “incurred”. Note that bulk losses have already been subtracted from schedule P incurred losses. Default value is “incurred”.

  • train_only (bool) – If True, the upper-left portion of the triangle will be returned. The upper-left portion of the triangle typically consists of actual loss experience. If False, the squared triangle, consisting of 100 observations is returned. Default value is True.

Returns

Either pd.DataFrame, trikit.triangle.IncrTriangle or trikit.triangle.CumTriangle.

CAS Loss Reserving Database datasets are loaded via load_lrdb. A number of additional keyword arguments can be passed to load_lrdb. For example, to retrieve the subset of commercial auto losses for grcode=1767:

In [1]: from trikit import load_lrdb
In [2]: df = load_lrdb(grcode=1767, lob="comauto")
In [3]: type(df)
Out[3]: pandas.core.frame.DataFrame

Alternatively, the resulting subset can be transformed to a triangle of the desired type:

In [4]: tri = load_lrdb(tri_type="cum", grcode=1767, lob="comauto")
In [5]: tri
Out[5]:
          1       2       3       4       5         6         7         8         9         10
1988 110,231 263,079 431,216 611,278 797,428   985,570 1,174,922 1,366,229 1,558,096 1,752,096
1989 121,678 279,896 456,640 644,767 837,733 1,033,837 1,233,015 1,432,670 1,633,619       nan
1990 123,376 298,615 500,570 714,683 934,671 1,157,979 1,383,820 1,610,193       nan       nan
1991 117,457 280,058 463,396 662,003 865,401 1,071,271 1,278,228       nan       nan       nan
1992 124,611 291,399 481,170 682,203 889,029 1,101,390       nan       nan       nan       nan
1993 137,902 323,854 533,211 753,639 980,180       nan       nan       nan       nan       nan
1994 150,582 345,110 561,315 792,392     nan       nan       nan       nan       nan       nan
1995 150,511 345,241 560,278     nan     nan       nan       nan       nan       nan       nan
1996 142,301 326,584     nan     nan     nan       nan       nan       nan       nan       nan
1997 143,970     nan     nan     nan     nan       nan       nan       nan       nan       nan

Notice that with grcode and lob specified as above, the returned DataFrame contains 55 records as expected (recall that by default, train_ind is set to False, otherwise the shape of the returned DataFrame would be (100, 3).

CAS Loss Reserving Database lines of business can be listed by calling get_lrdb_lobs:

In [1]: from trikit import get_lrdb_lobs
In [2]: get_lrdb_lobs()
Out[2]: ['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']

Unique combinations of “loss_key”, “grname” and “grcode” can be listed by calling get_lrdb_specs:

In [1]: from trikit import get_lrdb_specs
In [2]: get_lrdb_specs()
Out[2]:
    loss_key  grcode                               grname
0    comauto     266              Public Underwriters Grp
1    comauto     337                   California Cas Grp
2    comauto     353                       Celina Mut Grp
3    comauto     388                   Federal Ins Co Grp
4    comauto     460                      Buckeye Ins Grp
..       ...     ...                                  ...
431   wkcomp   41580                    Red Shield Ins Co
432   wkcomp   42439                Toa-Re Ins Co Of Amer
433   wkcomp   43915                       Rainier Ins Co
434   wkcomp   44091  Dowa Fire & Marine Ins Co Ltd Us Br
435   wkcomp   44300                   Tower Ins Co Of NY

Footnotes

1

https://www.casact.org/research/index.cfm?fa=loss_reserves_data