The project is straightforward data analysis. There is one important constraint: I NEED IT WITHIN 24 HOURS. I will not provide the input files. I can, however, provide test files. The deliverable is an easy-to-execute program that I can run on Microsoft Windows, in which I specify the two input files and the program produces the two output files in .csv or Excel format. The two input files are datasets.
The primary dataset is firm-year observations. For each firm-year, the firm is in one or more industries. So the primary dataset looks like this:
Firm – Year – Industry1 – Industry2 – Industry3 – … – Industry25
001009 – 1991 – 1123 – 1148 – 1156
001009 – 1992 – 1123 – 1148 – 1156 – 1185
001009 – 1993 – 1148 – 1185 – 1123 – 1244
002011 – 1991 – 1148 – 1214
002011 – 1992 – 2568 – 1148
002011 – 1993 – 2568 – 1477
002055 – 1993 – 1244
002055 – 1994 – 1244
The actual dataset has over 1,500 firms and 12 years. Not all firms have observations for the same years. The dataset is sorted by firms, then years.
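As a rough illustration of how the primary dataset could be read, here is a minimal Python sketch. It assumes the file is a comma-separated file with a header row and blank cells for unused industry columns; the posting does not state the actual file format, and the function name `load_primary` is hypothetical. Firm codes are kept as text so that leading zeros survive, and each firm-year's industries are stored as a set so that column order does not matter.

```python
import csv
import io


def load_primary(handle):
    """Read firm-year rows into {firm: {year: set_of_industries}}."""
    firms = {}
    reader = csv.reader(handle)
    next(reader)  # skip the header row (assumed to be present)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        firm, year = row[0].strip(), int(row[1])
        # Keep only non-empty industry cells; a set ignores column order.
        industries = {cell.strip() for cell in row[2:] if cell.strip()}
        firms.setdefault(firm, {})[year] = industries
    return firms


# Small in-memory stand-in for the real input file (assumed layout).
sample = io.StringIO(
    "Firm,Year,Industry1,Industry2,Industry3,Industry4\n"
    "001009,1991,1123,1148,1156,\n"
    "001009,1992,1123,1148,1156,1185\n"
    "002055,1993,1244,,,\n"
)
primary = load_primary(sample)
```

With a real file, the handle would come from something like `open(path, newline="")` instead of `io.StringIO`.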
The second dataset, the “LIST” dataset, is simply a list of industries, called IndustryX (~500 industries).
Example:
IndustryX
1185
1244
1477
1666
The first step is to generate a new dataset, going from firm-year observations to firm-year-industry observations, by pairing each firm-year with each industry in the LIST dataset. NOTE: the first year for each firm is NOT included. IndustryX is an industry that is part of the “LIST” dataset but is not one of the industries in the firm’s list for the previous year.
So using the examples above the new dataset will be:
Firm – Year – IndustryX
001009 – 1992 – 1185
001009 – 1992 – 1244
001009 – 1992 – 1477
001009 – 1992 – 1666
001009 – 1993 – 1244
001009 – 1993 – 1477
001009 – 1993 – 1666
002011 – 1992 – 1185
002011 – 1992 – 1244
002011 – 1992 – 1477
002011 – 1992 – 1666
002011 – 1993 – 1185
002011 – 1993 – 1244
002011 – 1993 – 1477
002011 – 1993 – 1666
002055 – 1994 – 1185
002055 – 1994 – 1477
002055 – 1994 – 1666
Let’s call this new dataset “NEW”
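The first step could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the data is already loaded into a dictionary of sets, the function name `build_new` is hypothetical, and "previous year" is taken to mean the firm's previous observed year, which the posting does not spell out for firms with gaps between years.

```python
# Example data from the posting, keyed by firm and then year.
primary = {
    "001009": {
        1991: {"1123", "1148", "1156"},
        1992: {"1123", "1148", "1156", "1185"},
        1993: {"1148", "1185", "1123", "1244"},
    },
    "002011": {
        1991: {"1148", "1214"},
        1992: {"2568", "1148"},
        1993: {"2568", "1477"},
    },
    "002055": {1993: {"1244"}, 1994: {"1244"}},
}
industry_list = ["1185", "1244", "1477", "1666"]


def build_new(primary, industry_list):
    """Return (firm, year, industryX) rows for the NEW dataset."""
    rows = []
    for firm in sorted(primary):
        years = sorted(primary[firm])
        # Pair each year with the one before it; the first year is skipped.
        for prev_year, year in zip(years, years[1:]):
            held_before = primary[firm][prev_year]
            for industry in industry_list:
                # Keep only LIST industries the firm did not hold last year.
                if industry not in held_before:
                    rows.append((firm, year, industry))
    return rows


new_rows = build_new(primary, industry_list)
```

On the example data this yields the 18 rows listed above, in the same firm, year, industry order.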
The second step is more complex. Using the primary dataset and the “NEW” dataset, create a new dataset that indicates whether or not a firm entered an industryX by screening the list of industries for each firm-year observation in the primary dataset and comparing each year to the year before. If the respective industry was entered (i.e. new), then mark it with a 1. Otherwise mark it with a 0. In order to do that, of course, you need to create a new dichotomous indicator (let’s call it Entered). Important: Note that the order of industries may change across years for a specific firm. The order should not matter. As long as the industry is already part of the list of industries for a specific firm, it is not new.
An important issue here is that if the firm enters a new industry that is not an industryX, it will not be in the final dataset anyway. Remember, the final dataset is derived from the "NEW" dataset with the new indicator added.
So the first deliverable is the final dataset. For the example above, the final dataset will be:
Firm – Year – IndustryX - Entered
001009 – 1992 – 1185 - 1
001009 – 1992 – 1244 - 0
001009 – 1992 – 1477 - 0
001009 – 1992 – 1666 - 0
001009 – 1993 – 1244 - 1
001009 – 1993 – 1477 - 0
001009 – 1993 – 1666 - 0
002011 – 1992 – 1185 - 0
002011 – 1992 – 1244 - 0
002011 – 1992 – 1477 - 0
002011 – 1992 – 1666 - 0
002011 – 1993 – 1185 - 0
002011 – 1993 – 1244 - 0
002011 – 1993 – 1477 - 1
002011 – 1993 – 1666 - 0
002055 – 1994 – 1185 - 0
002055 – 1994 – 1477 - 0
002055 – 1994 – 1666 - 0
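A possible way to compute the Entered indicator is sketched below in Python. It relies on the fact that every row in "NEW" is already an industry the firm did not hold in the previous year, so the industry counts as entered exactly when it appears in the firm's list for the current year. The function name `add_entered` and the in-memory layout are assumptions for illustration only.

```python
# One firm from the posting's example, with industries stored as sets.
primary = {
    "002011": {
        1991: {"1148", "1214"},
        1992: {"2568", "1148"},
        1993: {"2568", "1477"},
    },
}
# The matching slice of the NEW dataset for 1993.
new_rows = [
    ("002011", 1993, "1185"),
    ("002011", 1993, "1244"),
    ("002011", 1993, "1477"),
    ("002011", 1993, "1666"),
]


def add_entered(primary, new_rows):
    """Append Entered (1 or 0) to each (firm, year, industryX) row."""
    final = []
    for firm, year, industry in new_rows:
        # Set membership ignores the order of the industry columns.
        entered = 1 if industry in primary[firm][year] else 0
        final.append((firm, year, industry, entered))
    return final


final_rows = add_entered(primary, new_rows)
```

Here only industry 1477 is marked 1 for firm 002011 in 1993. Industry 2568 was entered in 1992 but is not in the LIST dataset, so it never appears.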
The second deliverable is the "count" dataset. Using the "Final" dataset, for each firm-year observation, return the count of entries (i.e. the sum of "Entered" for each firm-year couple).
For our example, the "count" dataset is:
Firm - Year - Count
001009 - 1992 - 1
001009 - 1993 - 1
002011 - 1992 - 0
002011 - 1993 - 1
002055 - 1994 - 0
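The count step amounts to summing Entered within each firm-year. A minimal Python sketch follows, assuming the final dataset is held as a list of tuples; the function name `count_entries` is hypothetical, and the CSV layout at the end is only one possible output format.

```python
import csv
import io

# A shortened slice of the final dataset from the example.
final_rows = [
    ("001009", 1992, "1185", 1),
    ("001009", 1992, "1244", 0),
    ("001009", 1992, "1477", 0),
    ("001009", 1992, "1666", 0),
    ("002011", 1992, "1185", 0),
    ("002011", 1992, "1244", 0),
]


def count_entries(final_rows):
    """Sum Entered per (firm, year), keeping first-seen order."""
    totals = {}
    for firm, year, _industry, entered in final_rows:
        key = (firm, year)
        totals[key] = totals.get(key, 0) + entered
    return [(firm, year, total) for (firm, year), total in totals.items()]


counts = count_entries(final_rows)

# Write the result as CSV text; a real run would write to a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Firm", "Year", "Count"])
writer.writerows(counts)
csv_text = buffer.getvalue()
```

One edge case to be aware of: a firm-year with no rows in the final dataset (every LIST industry was already held the year before) would not appear in the count output under this approach.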
Hello, how are you?
I have experience with VB.Net, VC++, C# and MFC.
I have experience developing MCU embedded systems in C and C++ on PIC, RENESAS, NEC....
I have designed a concrete plant control system, a Power Line Communication system and many
other projects. I also have experience developing Windows applications using Java, VB.Net and MFC.
I have also designed a PHP e-commerce page.
I can help you on this project. Thank you very much.
$111 USD in 3 days
5.0 (13 reviews)
4 freelancers are bidding on average $182 USD for this job
Hello,
Before you select a part-time developer from here, take a look at our portfolio: fugacode.com. If you like what you see, contact us. That's all.
"Why hire part-time college students when you can hire professional developers for the same cost?"
Regards,
FUGACODE Team
PS: Freelancer milestone system is used.