I want to share two things. The first is splitting an input dataset or file into multiple datasets or files using SAS. The second is deleting particular records from the input dataset and writing the remaining records to the output file. You can apply the same logic in your SAS macros.
File handling in SAS is very important if you want to work as a data science developer in financial analytics, because your input observations are critical in data analysis.
1). Splitting a Dataset
The template below has one input dataset and two output datasets. Your input file contains many observations, and the requirement is to split them into multiple files or datasets based on conditions.
DATA New-Dataset-Name-1 (OPTIONS) New-Dataset-Name-2 (OPTIONS);
    SET Old-Dataset-Name (OPTIONS);
    IF (insert conditions for Dataset1) THEN OUTPUT New-Dataset-Name-1;
    IF (insert conditions for Dataset2) THEN OUTPUT New-Dataset-Name-2;
RUN;
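The SET statement in SAS reads observations from the input dataset. The (OPTIONS) placeholders stand for dataset options such as KEEP=, DROP=, or WHERE=; see the SAS documentation for full details.
A real-world example in SAS that splits one dataset into multiple datasets: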
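As a rough illustration of the (OPTIONS) placeholders, here is a minimal sketch using two dataset options. The dataset 'sample', its variables (Name, Rank, Score), and the output name 'lower_ranks' are assumptions made up for this example.

```sas
/* Hypothetical input data, for illustration only */
DATA sample;
    INPUT Name $ Rank Score;
    DATALINES;
Ann 1 88
Bob 2 75
Cara 3 91
Dev 4 67
;
RUN;

/* WHERE= on the input filters observations as they are read. */
/* KEEP= on the output limits which variables are written.    */
DATA lower_ranks (KEEP=Name Rank);
    SET sample (WHERE=(Rank <= 2));
RUN;
```

With this input, 'lower_ranks' holds only the Rank 1 and Rank 2 observations, without the Score variable.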
DATA freshmen sophomores juniors seniors;
    SET sample;
    IF (Rank = 1) THEN OUTPUT freshmen;
    IF (Rank = 2) THEN OUTPUT sophomores;
    IF (Rank = 3) THEN OUTPUT juniors;
    IF (Rank = 4) THEN OUTPUT seniors;
RUN;
In the above example, each rank (1, 2, 3, and 4) is written to its own dataset, so the step produces four output datasets. Observations with any other Rank value are not written to any of them.
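To try the split end to end, here is a small self-contained sketch. The input rows and the Name variable are made up for illustration. It also uses an ELSE IF chain, which is one possible variation: because the Rank conditions are mutually exclusive, SAS can stop testing once one of them matches.

```sas
/* Hypothetical input data; the rows and variable names are assumptions */
DATA sample;
    INPUT Name $ Rank;
    DATALINES;
Ann 1
Bob 2
Cara 3
Dev 4
Eli 1
;
RUN;

DATA freshmen sophomores juniors seniors;
    SET sample;
    /* ELSE IF skips the remaining tests after the first match */
    IF Rank = 1 THEN OUTPUT freshmen;
    ELSE IF Rank = 2 THEN OUTPUT sophomores;
    ELSE IF Rank = 3 THEN OUTPUT juniors;
    ELSE IF Rank = 4 THEN OUTPUT seniors;
RUN;
```

With this input, 'freshmen' receives two observations (Ann and Eli) and each of the other three datasets receives one.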
2). DELETE Records From the Input File
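Here you drop matching records as the input dataset is read and write the remaining records to the output dataset. You do not need an explicit OUTPUT statement: when a DATA step contains none, SAS writes each observation automatically at the end of the step iteration, and DELETE stops processing the current observation before that happens. The input dataset itself is unchanged unless the DATA statement names the same dataset as the SET statement, which is the self-updating case.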
DATA New-Dataset-Name (OPTIONS);
    SET Old-Dataset-Name (OPTIONS);
    IF (insert conditions) THEN DELETE;
RUN;
Here is a real-world example that deletes matching records using SAS:
DATA sample_small;
    SET sample;
    IF (Rank = 1) THEN DELETE;
RUN;
Now the 'sample_small' dataset contains every record from 'sample' except those with Rank = 1. The 'sample' dataset itself is unchanged.
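If you do want the input dataset itself to be updated, one way is to name the same dataset in both the DATA and SET statements. The sketch below uses made-up input rows for illustration. Note that this replaces 'sample', so the deleted records cannot be recovered afterwards.

```sas
/* Hypothetical input data; the rows and variable names are assumptions */
DATA sample;
    INPUT Name $ Rank;
    DATALINES;
Ann 1
Bob 2
Cara 3
Dev 4
Eli 1
;
RUN;

/* Same name in DATA and SET: 'sample' is replaced by the filtered copy */
DATA sample;
    SET sample;
    IF Rank = 1 THEN DELETE;
RUN;

/* List the remaining observations to check the result */
PROC PRINT DATA=sample;
RUN;
```

With this input, 'sample' ends up with three observations (Bob, Cara, and Dev).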