Category Archives: Learning

How I changed ProjectDocs

This is my first post in the “Experience” series. In these posts I’ll discuss my previous projects: why I worked on them, how I built them, what mistakes I made and what I learned along the way. I hope you will enjoy this boring stuff.

“ProjectDocs” is the first application I worked on at Omni Offshore. It was July 1st, 2010, my first day at Omni Offshore. I completed the formal introduction, received my ID card and the regular stationery (pencil, notebooks and so on), and my manager showed me my workspace. I relaxed and had a brief chat with my co-workers, learning about their hobbies and interests. Then my manager came over with a few documents and introduced me to my first project to maintain: “ProjectDocs”.

I quickly went through the BRD (Business Requirements Document), a kind of SRS. Let me explain the project briefly. It is an ASP.NET application that depends on Documentum from EMC. Basically, the system allows users from the shipyard to review drawings from the engineering team and comment on them if they are not clear. Once everything is clear, the drawings are used to build ships. By the way, our company builds big ships for oil and gas companies. We call them FSOs (Floating Storage and Off-take) and FPSOs (Floating Production Storage and Offloading).

So this gave me the opportunity to learn more about Documentum, DQL, DFC and DFS. The system was already in production, and I received a list of issues and enhancement requests from business users. I went through the code and started to understand what Documentum was all about. I sat with our Documentum resource to learn more about its structure and its query language (DQL), and made a few POCs to understand the interfacing between ASP.NET and Documentum using DFC (Documentum Foundation Classes). Within a week I was ready to take on those issues and enhancements.

Documentum is a self-contained Java-based system. It holds both the metadata and the content of the documents. It supports organising documents by doctypes, and each doctype is defined by its attributes. Think of each doctype as a table in a DBMS and its attributes as the table's columns. Documentum exposes an API by means of DFC or DFS. DFC is a COM component that has to be installed on each client machine, while DFS is a set of services exposed from the Documentum server. We can also use DQL, a query language, to retrieve or update data in the Documentum system. Initially our ASP.NET application communicated with Documentum through DFC.

When I went through the code I found that our ASP.NET application was nothing but a wrapper around Documentum functionality, and there were many pull and push requests between Documentum and the application. I also found that the system was very tightly coupled to the shipyard name, to the point that they had to recreate virtually everything, from the Documentum structure to the ASP.NET application, whenever there was a new shipyard to deal with. In fact, that's exactly what had happened: there were four such applications with different code bases to deal with different shipyards. That meant any change had to be made manually on each of those code bases and deployed separately to the production server.

I concluded that, with the current setup, it was not possible to make the changes. Also, with so many pull and push requests between Documentum and our application, the overall responsiveness was poor and only getting worse. So I talked to our manager, said we needed to think seriously about this, and asked for a meeting with all the stakeholders.

There were 10 participants in the meeting room, and one of them was our GM. I had barely been in charge for a week and my GM was interested in my opinion. That made me aware of the importance of the system as well as the risk in my recommendations. I proceeded to the meeting room anyway, my hands shaking a bit as I thought about what to say. Everybody looked at my manager, who started the meeting and listed all the issues with the current setup. Then all eyes turned to me for my opinion, and I said, “Scrap the current setup and build a better system from the ground up.” My GM did not seem convinced but agreed anyway after a couple of discussions.

Now the interesting part: designing the new system. I still took Documentum into consideration, as it is our central repository of documents and many workflows run on those documents. But Documentum is internal, and because of that we need the ASP.NET application to sit between the public users (shipyard users / engineering team) and Documentum. The change I made was to keep Documentum only to store documents and to move the system data to Oracle. With this change, the whole business process, such as commenting and approvals, happens in the ASP.NET system, and only the final documents go to Documentum. As a result, the system does not talk to Documentum unless it needs a document, whereas previously the whole business process ran on Documentum.

This change brought flexibility to maintaining the system: you can create as many shipyards as you want and assign users to represent them. We no longer need to touch Documentum every time we introduce a new shipyard. As a by-product of this whole exercise I gained a robust understanding of Documentum, DQL, DFS and DFC. I was able to bring flexibility and speed into the system, which helped the end users. It also helped our IT department in terms of maintenance and reduced the dependency on the Documentum resource. And finally, it reduced the cost of ownership for the company, as the system became so popular that all our shipyards started using it.

ORACLE: View compiled with errors

Many times, when we try to create views in Oracle, we end up with messages like “View created with warnings” or “View compiled with errors”, but SQL Developer does not reveal the actual error. In such cases, to find out what error was actually raised, run the following query at the SQL Developer or SQL*Plus prompt:

SELECT TEXT FROM DBA_ERRORS WHERE NAME='VIEW_NAME_HERE'

If the same view name exists in more than one schema, add the target schema name to the query:

SELECT TEXT FROM DBA_ERRORS WHERE NAME='VIEW_NAME_HERE' AND OWNER='SCHEMA_NAME'

This view always contains the most recent error raised on that object.

Enterprise IT Landscape

For the last 6 years I have been working as a solution architect for an offshore company. As an architect I am responsible for designing the application landscape of the company to proactively fulfil the computing needs of its employees.

When I first joined the company, I was assigned to look into an existing system and add new features. It was based on ASP.NET but depended heavily on Documentum for all its data needs. I was horrified by the idea of using Documentum as a database and still expecting a fast, responsive system. I explained the same to my manager and called for a meeting. I was a bit nervous about what I was going to say. Our director asked for my recommendation and I said, “Scrap the existing one and build a new one with a better architecture.” He seemed a bit disappointed, but after I presented my case he finally gave his go-ahead.

Later I took steps to abolish a few independent systems and integrated their data sources to build portals for each department. I even provided API services wherever one portal needed data from another.

All this experience led me to the idea of mapping the application landscape of an enterprise. With my limited experience and understanding, I classify the application landscape of an enterprise into the following generations:

Computer Generation:

In this generation of applications, the main idea is simply to capture data and present it back to the user. The system does not do any transformation of the data; it depends heavily on the end user's skills to turn the data into meaningful insights.

Users keep exporting data into Excel sheets to work on it further. An enterprise in this generation will have lots of independent applications with different characteristics: programming languages used, API versions, database structures or even the databases used (trust me, many applications use SQL Express or Access DB), and so on. It's chaos, and anyone who wants to do something about it faces an overwhelming amount of change. Usually we end up saying it's fine as long as it doesn't break.

This hinders progress and limits the imagination of what could be achieved with all the data being captured. Eventually, we actively look for an off-the-shelf solution to replace all this mess.

Computing Generation:

This is slightly better than the previous one. Here, users are more interested in searching for and retrieving data. Transformation of data becomes more common across applications, and more emphasis is put on how easily users can retrieve data.

It stops there, though: we still have a plethora of applications to maintain, each living in its own silo. These are friendlier to end users than to developers, and users are still looking for off-the-shelf products to replace them.

Analytical Generation:

This is where the idea of integrating applications starts. People begin thinking about portals, common data sources, and a common look & feel and navigation across the portals. Independent applications are either abolished in favour of more integration or merged into portals.

It's a new life for both users and devs. For users there is the same look and feel and the same navigation across the site. Data is more easily accessible, and the system provides insights by bringing the user's attention to various data points. For the first time, users spend more time working towards solving a problem than wrangling data.

Applications in this generation present information as graphs and charts rather than endless rows of data. Users talk in terms of KPIs rather than going through Excel sheets.

Intelligence Generation:

Systems in this age are well automated. They proactively work on models and provide feeds to users, and users spend their time building the future based on those feeds. Applications of this age both influence and support decisions.

These systems constantly monitor and analyse data in the background. They have access to various data sources to pull data from, and they run AI algorithms to prepare important feeds for the users. They show recommendations to the end users and proactively empower them with the right data to solve a problem.

As an architect, my endeavour is to bring applications into at least the analytical generation, if not the intelligence generation.

PowerShell: Remote session

Remotely connecting to a computer and performing administrative tasks is a very common activity for any system administrator, and PowerShell offers an excellent way to do exactly that.

To remotely connect to a computer, use the command below in PowerShell:

Enter-PSSession -ComputerName XXXX -Credential YYYY

Here XXXX is the computer name or IP of the remote computer and YYYY is the user name you want to log in with. This opens a remote PS session to the remote computer, provided the computer is in the TrustedHosts list of WinRM. If not, PowerShell throws an exception like the one below:

[screenshot: Enter-PSSession failing with a WinRM TrustedHosts error]

In that case, we need to add the remote computer to the trusted hosts of WinRM. To do that, issue the command below:

winrm s winrm/config/client '@{TrustedHosts="RemoteComputer"}'

Here “RemoteComputer” is the name or IP of the computer you want to connect to.

[screenshot: adding the remote computer to the WinRM TrustedHosts list]

Now execute the Enter-PSSession command again to open a remote PS session on the remote computer.

[screenshot: remote PowerShell session established]

Now you have a remote command shell (PowerShell).
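Putting the whole flow together, here is a minimal sketch; the computer name “SRV01” is only a placeholder, and the TrustedHosts change uses the WSMan: drive as an alternative to the winrm command shown above.

# Check what the WinRM client currently trusts
Get-Item WSMan:\localhost\Client\TrustedHosts

# Append the remote machine ("SRV01" is a placeholder name); -Concatenate keeps existing entries
# (run this from an elevated PowerShell)
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "SRV01" -Concatenate

# Open the remote session, do your work, then leave it
Enter-PSSession -ComputerName SRV01 -Credential (Get-Credential)
Exit-PSSession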

PowerShell: How to manage Hyper-V VMs in Windows 10

I have been using Windows 10 since its first release. So far I am happy with it and have had no issues. It comes with Hyper-V for deploying VMs. I normally do not install third-party tools unless it is absolutely necessary, so I started using Hyper-V to run my DB server as well as my Linux servers.

To manage VMs under Hyper-V, we have two options: one is to use “Hyper-V Manager” and the other is to use PowerShell. The first one is pretty common and easy to use, but the second method is very interesting. The following post covers the cmdlets I have learned so far for working with Hyper-V. I know this is not the full picture, but I will keep updating this post with the latest information. So let's jump in.

Show all VMs under Hyper-V:

To list all the VMs available under Hyper-V, use the following command;

 get-vm

This command lists all the VMs available under Hyper-V. Here is the output:

[screenshot: Get-VM output listing the VMs]

In my case I have Windows 2003 as my DB server and Ubuntu 15.10. (Get-VM with no parameters lists every VM on the local Hyper-V host.)
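If you only care about the machines that are currently running, one handy variation is to pipe the output through a quick filter:

# Show only the VMs whose State is Running
Get-VM | Where-Object State -eq 'Running'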

Start a VM:

To start a VM, fire the following command;

start-vm win2k3

The above command starts the VM with the name “win2k3”. PowerShell shows a progress bar at the top of the screen. The following is the current status of my VMs; look at the status of the VM WIN2K3:

[screenshot: Get-VM output showing WIN2K3 in the Running state]

Save a VM:

In order to save the current VM state, simply issue the below command;

 save-vm win2k3

This command saves the current state of the VM, and again PowerShell shows a progress bar to indicate the progress. Now look at the current state of my VMs:

[screenshot: Get-VM output showing WIN2K3 in the Saved state]

Stop a VM:

By now you have pretty much guessed the next command;

stop-vm win2k3

The above command stops the running VM. If the current state of the VM is “Saved” and you issue this command to stop it, PowerShell prompts you with the following:

Confirm
Hyper-V can't shut down virtual machine "WIN2K3" because the Shutdown integration service is
unavailable. You can turn off the virtual machine by selecting [Y]es, but this is similar to
pulling the power on a physical machine. To avoid potential data loss, select [N]o, then pause
or save the virtual machine. Do you want to turn off the virtual machine?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):

Therefore, make sure the VM is in the running state before you issue this command. Now check my current VM state:

[screenshot: Get-VM output showing the VM state after Stop-VM]
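If a hard power-off really is what you want (the equivalent of answering Yes to that prompt), Stop-VM has a -TurnOff switch for it; treat it like pulling the power cord:

# Turn the VM off without a guest shutdown; unsaved data inside the guest may be lost
Stop-VM -Name win2k3 -TurnOff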

Take a Snapshot

To take a snapshot of the VM, issue the following command;

 checkpoint-vm -VMName win2k3
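Once you have a few checkpoints, you can also list them and roll back to one. A small sketch, assuming the same VM name:

# List all checkpoints of the VM
Get-VMSnapshot -VMName win2k3

# Restore the most recent checkpoint
Get-VMSnapshot -VMName win2k3 |
    Sort-Object CreationTime |
    Select-Object -Last 1 |
    Restore-VMSnapshot -Confirm:$false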

For more awesome Hyper-V commands visit here.


ConEMU, A better way for console management in Windows

For server administrators and deep-dive developers, the console is a vital helper. Sometimes it is very hard to manage more than two consoles on the screen: they consume a good amount of real estate on the desktop, which is painful when you want to run background scripts or self-host WCF applications. The good news is that there are a few alternatives to the default console on a Windows desktop or server. I regularly use one of them, and the following is a brief account of it.

ConEMU

ConEMU is short for Console Emulator. It is a really versatile emulator where you can run multiple console applications in tabs.

[screenshot: ConEMU running multiple console applications in tabs]

You can add your own console applications as tasks in the emulator. Once added, they are available right on the screen to kick off. You can also run any of the console applications as Administrator.

[screenshot: ConEMU task list with custom console applications]

I normally run the default console, an admin console, FAR Manager and SQL*Plus for my Oracle database concurrently for my regular operations.

[screenshot: ConEMU with several consoles open side by side]

Another useful feature is that you can copy command output from the console by just selecting the lines. Once you release the mouse, the selected lines are placed on the clipboard, ready to paste anywhere you want.

You can download ConEMU from here.

I will update this post with details about the different tasks we can do with this console emulator.

PowerShell, Working with MS Access Database

This is my second post on PowerShell, continuing my earlier post. The more I work with PS, the more interested I get. For an administrator and a geeky programmer, the possibilities are endless. Today my aim is to connect to MS Access, read a table, and create another table with unique records. Initially I thought of using SQL in MS Access to remove the duplicates by grouping them and taking the top 1. But the issue is that I can get unique records only for the “Doc Number” and “Doc Revision” columns of the table (check my previous post); as soon as I include all the other columns I can no longer get unique records. That left me with one option: write C# code to do the job. But I wanted to experiment with PowerShell instead.

Once you install PowerShell, you get two applications;

  1. PowerShell
  2. PowerShell ISE (Integrated Scripting Environment)

The ISE looks like the screenshot below, and this is where you can write shell scripts for your needs. Of course, you can use any editor to write the script, but using the ISE gives you syntax highlighting, IntelliSense, cmdlet help and also a shell to test your script.

[screenshot: the PowerShell ISE]

As usual, I divided the whole problem into small milestones to achieve progressively. The following are my milestones:

  1. Connect to the MS Access MDB file through OLE DB
  2. Create a dataset with two tables, one with only the schema (DT1) and another one with the data (DT0)
  3. Go through each row and check whether DT1 already has the record; if not, insert the current record
  4. Export DT1 to Excel so that I can use it further

With these milestones in place, I pulled up the ISE and started scripting for the first milestone.

Connecting to MDB file:

To achieve this, I need three variables: one to store the path to the MDB file, one to store the SQL command, and one to store the connection string. In PowerShell a variable is denoted with “$”; check the following code:

$dataSource = "C:\Users\pradeepy\Documents\uote_docs.mdb"
$selectQuery = "SELECT * FROM FULL_DOCS_INFO"
$dsn = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=$dataSource"

Here, “$dataSource”, “$selectQuery” and “$dsn” are the variables I am going to use. Since my variables are ready, I now need to connect to the database. The following code does the job:

Write-Host "Creating connection to database" -ForegroundColor Cyan
$dbCon = New-Object System.Data.OleDb.OleDbConnection $dsn
$sqlCmd = New-Object System.Data.OleDb.OleDbCommand $selectQuery, $dbCon
$dbCon.Open()

In this code, “Write-Host” is a cmdlet we can use to print something to the screen. We can also set the foreground and background colours of the text; in this case, I used “Cyan” for the text. Now I should be able to run the code. I ran it, only to be hit by an exception:

[screenshot: exception about the OLE DB provider for MS Access]

It seems I cannot access the OLE DB provider for MS Access. I immediately checked my ODBC drivers by firing the command Get-OdbcDriver, which lists all the installed drivers, and found my Access drivers. That means it is not an issue with the drivers but something else. A quick Bing search revealed that this is a typical issue if you run the ISE in 64-bit mode. OK, let's close it, open the 32-bit version of the ISE and run the script. Now it's working fine.
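If you are not sure which flavour of the ISE (or console) you are currently in, a quick way to check is:

# Returns True in a 64-bit process, False in a 32-bit (x86) one
[Environment]::Is64BitProcess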

Note:

In some cases, the system might throw an exception saying it cannot run the script due to the script execution policy. In this case you can bypass the policy with the following cmdlet;

Set-ExecutionPolicy -ExecutionPolicy Bypass

This bypasses the policy so that you can run the script. To set it back, fire the above command again, replacing “Bypass” with a stricter value such as “Restricted”.

Creating the DataSet:

Now, the second milestone. We need two data tables to handle the data: one for the data from the MDB file and another one for the unique records. The following code does that:

Write-Host "Creating datasets" -ForegroundColor Cyan
$dataSet = New-Object System.Data.DataSet
$sourceDataTable = New-Object System.Data.DataTable "SOURCE_UOTE_DOCS"
$finalDataTable = New-Object System.Data.DataTable "FINAL_UOTE_DOCS"
$dataSet.Tables.Add($sourceDataTable)
$dataSet.Tables.Add($finalDataTable)

Once the data tables are ready, we need to pull the data into DT0. The code below does that:

Write-Host "Fill schema of the table into a data datable" -ForegroundColor Cyan
$dataAdaptor = New-Object System.Data.OleDb.OleDbDataAdapter $sqlCmd
$dataAdaptor.FillSchema($dataSet.Tables[1], [System.Data.SchemaType]::Mapped)
[void] $dataAdaptor.Fill($dataSet.Tables[0]);

Write-Host "Closing the connection" -ForegroundColor Cyan
$dbCon.Close();

Remove duplicate records:

It's time to load only the unique records. To do that, for each row in DT0 we check whether DT1 already has a row with the same Doc Number and Doc Revision; if not, we put the row into DT1. In the following code, we pipe the DT0.Rows collection into a ForEach-Object loop. In the loop, we try to select records from DT1 matching the current object's Doc Number and Doc Revision columns. If the count is 0 we insert; otherwise we just log it. Have a look at the code below:

$duplicateRecordsCount = 0
$uniqRecordsCount = 0
$dataSet.Tables[0].Rows | ForEach-Object{
    If($dataSet.Tables[1].Select("[Doc Number] like '"+$_["Doc Number"]+"' and [Doc Revision] like '"+$_["Doc Revision"]+"'").Count -eq 0){
        
        $uniqRecordsCount++
        $newRow = $dataSet.Tables[1].NewRow()
        $newRow[0] = $uniqRecordsCount ##  ID
        $newRow[1] = $_[1]             ##  Name
        $newRow[2] = $_[2]             ##  DirectoryName 
        $newRow[3] = $_[3]             ##  FullName
        $newRow[4] = $_[4]             ##  Doc Name
        $newRow[5] = $_[5]             ##  Doc Number
        $newRow[6] = $_[6]             ##  Doc Revision
        $newRow[7] = $_[7]             ##  PROC_CODE
        $newRow[8] = $_[8]             ##  SUB_PROC_CODE
        $newRow[9] = $_[9]             ##  DOC_REV
        $newRow[10] = $_[10]           ##  DOC_REV_EXT
        $newRow[11] = $_[11]           ##  EXT

        $dataSet.Tables[1].Rows.Add($newRow)

        $rowCmt = "New row added...["+$dataSet.Tables[1].Rows.Count+"]"

        Write-Host $rowCmt  -ForegroundColor Green
    }
    Else {
        $duplicateRecordsCount++
        Write-Host "Already exists..."  -ForegroundColor Red
    }
}

In the above code, $_ represents the current object in the loop. PS does not have an equality operator like “==” in C#; it uses “-eq” for equality checks. Once the loop reaches the end, we will have only the unique records in DT1.
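As an aside, if all you need are the deduplicated source rows (without the counters and the second table), Sort-Object can do the same job in one pipeline. This is only a sketch of an alternative, not what I actually ran:

# One source row per unique Doc Number / Doc Revision pair
$dataSet.Tables[0].Rows |
    Sort-Object {$_["Doc Number"]}, {$_["Doc Revision"]} -Unique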

Export to CSV:

Now our last milestone: pushing the unique records out to a CSV file (which opens nicely in Excel). Follow the code below:

Write-Host "Exporting to excel " -ForegroundColor Cyan
$dataSet.Tables[1] | Export-Csv c:\logs\new_uote_docs.csv

With this, I was able to create a CSV file to be used in my Java program to upload into Windchill.
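One small caveat: on Windows PowerShell, Export-Csv writes a “#TYPE …” header line at the top of the file by default. If the downstream tool trips over it, the -NoTypeInformation switch suppresses it:

$dataSet.Tables[1] | Export-Csv c:\logs\new_uote_docs.csv -NoTypeInformation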

The following is the complete code that I used to do this.

$dataSource = "C:\Users\pradeepy\Documents\uote_docs.mdb"
$selectQuery = "SELECT * FROM FULL_DOCS_INFO"
$dsn = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=$dataSource;"

Write-Host "Creating connection to database" -ForegroundColor Cyan
$dbCon = New-Object System.Data.OleDb.OleDbConnection $dsn
$sqlCmd = New-Object System.Data.OleDb.OleDbCommand $selectQuery, $dbCon
$dbCon.Open()

Write-Host "Creating datasets" -ForegroundColor Cyan
$dataSet = New-Object System.Data.DataSet
$sourceDataTable = New-Object System.Data.DataTable "SOURCE_UOTE_DOCS"
$finalDataTable = New-Object System.Data.DataTable "FINAL_UOTE_DOCS"
$dataSet.Tables.Add($sourceDataTable)
$dataSet.Tables.Add($finalDataTable)

Write-Host "Fill schema of the table into a data datable" -ForegroundColor Cyan
$dataAdaptor = New-Object System.Data.OleDb.OleDbDataAdapter $sqlCmd
$dataAdaptor.FillSchema($dataSet.Tables[1], [System.Data.SchemaType]::Mapped)
[void] $dataAdaptor.Fill($dataSet.Tables[0]);

Write-Host "Closing the connection" -ForegroundColor Cyan
$dbCon.Close();

Write-Host "No. of rows in dataset >" -ForegroundColor Green
$dataSet.Tables[0].Rows.Count | Select-Object

$duplicateRecordsCount = 0
$uniqRecordsCount = 0
$dataSet.Tables[0].Rows | ForEach-Object{
    If($dataSet.Tables[1].Select("[Doc Number] like '"+$_["Doc Number"]+"' and [Doc Revision] like '"+$_["Doc Revision"]+"'").Count -eq 0){
        
        $uniqRecordsCount++
        $newRow = $dataSet.Tables[1].NewRow()
        $newRow[0] = $uniqRecordsCount ##  ID
        $newRow[1] = $_[1]             ##  Name
        $newRow[2] = $_[2]             ##  DirectoryName 
        $newRow[3] = $_[3]             ##  FullName
        $newRow[4] = $_[4]             ##  Doc Name
        $newRow[5] = $_[5]             ##  Doc Number
        $newRow[6] = $_[6]             ##  Doc Revision
        $newRow[7] = $_[7]             ##  PROC_CODE
        $newRow[8] = $_[8]             ##  SUB_PROC_CODE
        $newRow[9] = $_[9]             ##  DOC_REV
        $newRow[10] = $_[10]           ##  DOC_REV_EXT
        $newRow[11] = $_[11]           ##  EXT

        $dataSet.Tables[1].Rows.Add($newRow)

        $rowCmt = "New row added...["+$dataSet.Tables[1].Rows.Count+"]"

        Write-Host $rowCmt  -ForegroundColor Green
    }
    Else {
        $duplicateRecordsCount++
        Write-Host "Already exists..."  -ForegroundColor Red
    }
}

Write-Host "No. of rows in new dataset >" -ForegroundColor Green
$dataSet.Tables[1].Rows.Count | Select-Object

Write-Host "Total duplicate records :" -ForegroundColor Red
Write-Host $duplicateRecordsCount

Write-Host "Exporting to excel :" -ForegroundColor Cyan
$dataSet.Tables[1] | Export-Csv c:\logs\new_uote_docs.csv

My Journey into PowerShell

One more feather in my cap: recently I started using PowerShell from Microsoft. The requirement is simple: our company has a huge pile of documents for a project and they want to put all of them into our newly commissioned document management system, Windchill. So our consultants gave us a Java console application to load the documents into Windchill. That tool requires a CSV file with predefined columns, so it all fell onto my head to prepare that CSV file. I was given the location where those documents were sleeping, and I went there only to make myself sick: there are just 18000+ documents… Since I had no other option, I started thinking about how I could prepare the CSV file out of these. That CSV file should have the following,

  1. Document Name
  2. Location of document
  3. Document Number
  4. Document Version
  5. Process Code
  6. Sub-Process code
  7. Client document number

So how do I get these? Going through each document and noting down the details is not an option I want to exercise. I am a programmer at heart, fellas… I want to do things simply enough that I can manage them. With some keen observation I found that our engineering guys had made my life much easier: they put all of it in the document name itself. So all I need to do is parse the document name and put that information into a CSV file. That's easy, I thought to myself; all I need is to write a program to do it. Now here's the thing: maybe I was too bored of C#, or maybe I saw a geek video last night, but I wanted to do it in a much simpler way, with some script. And there, at that moment, I saw a blue icon with a command prompt symbol lying on my task bar. It's a long forgotten tool, one I used to open only after watching a computer geek movie, only to close it once more. It is PowerShell from Microsoft. OK, now it seems I have a solution for my problem. A quick web search revealed that this little guy can do many of the things that I want, so it's time to learn something new. Learning new things has become a hobby for me, and it's pretty exciting. I started organizing my requirements, what I want from PowerShell, the algorithm…

  1. Go through all the folder recursively and read all document names
  2. List Name, revision, document number and physical path to the document.
  3. Push them into a CSV file and I am done….

Now my milestones to achieve this,

  1. List document names by recursively reading through the folders
  2. Check how to parse the document names and list them
  3. Finally push them into a CSV file

Listing document names recursively,

Falling back to my favourite search engine, Bing, I learned that “Get-ChildItem” will do the job for me. Here is my output: [screenshot: Get-ChildItem output]

OK, I have achieved my first goal and am moving on to the next one. This time I want to list the documents recursively. That's the easy part: “Get-ChildItem -Recurse” will do the job. That's cool, I crossed my second milestone too. Now my third step: how to substring a string? We can achieve this using the “Substring()” method. It takes two parameters: the index to start from and the length of the string to extract. See the following output: [screenshot: Substring example]
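For example, on a sample string shaped like the real file names described further below (shown purely for illustration), the two-argument form works like this:

# Start at index 0 and take 19 characters
"TP123-8080-DWG-0001-Rev0 Some Name.Ext".Substring(0, 19)   # -> TP123-8080-DWG-0001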

OK, going well. Now I need to pump the document name into Substring and list the parts as individual columns. But wait, my output is not right: I want only two things, the document name and the document's physical path. Once I have these two, I can extract the remaining columns from the document name. One more thing: I don't want to bring in the folder names, only the files within them. So, it's time to play with “Get-ChildItem”.

My immediate task is to exclude the directories, which I can do with “Get-ChildItem -Attributes !Directory”. Let's see the output;

Original:

[screenshot: Get-ChildItem output including folders]

Here all the highlighted entries are folders in the current folder. Now let's apply that switch;

[screenshot: Get-ChildItem output with directories excluded]

They are gone… Great, now it's time to change the output. I want only the name of the document and the full path to it. We can achieve this by piping the output into another cmdlet named “Select-Object”. Here is the output;

[screenshot: Select-Object showing only Name and FullName]

At this point, I am able to get the name of the document and the full path to it. Now I need to extract the desired columns; remember, I need the document number and version.

Now, a little confession to make: the output you have seen in the screenshots so far is not from the production server, so the names are not accurate. The actual file names look like “TP123-8080-DWG-0001-Rev0 Some Name.Ext”. In this name, I am interested in the first part of the file name: it contains the document number and revision, and hopefully they are accurate. So all I need is to extract the file name up to the first space. Let's see how we can achieve this.

To make a substring from the input coming from Select-Object, we write an expression such as {$_.Name.Substring(0, $_.Name.IndexOf(' '))}. Look at the output;

[screenshot: calculated column produced by the Substring expression]

If you observe closely, the column name is nothing but our expression. That's not what we want, so let's give it a name. For that, we need to change the expression slightly to @{Name="DOC_NUMBER"; Expression={$_.Name.Substring(0, $_.Name.IndexOf(' '))}}. The following is the output;

[screenshot: calculated column now named DOC_NUMBER]

Here I need to refine the DOC_NUMBER expression to remove the revision part, because the revision number is not part of the document number. My refined expression is @{Name="DOC_NUMBER"; Expression={$_.Name.Substring(0, $_.Name.Substring(0, $_.Name.IndexOf(' ')).LastIndexOf('-'))}}

Great!!! Now let's add the revision number to our list. To achieve this, we reuse the document number expression and take everything after the last “-”. Since the document number has a fixed length of 19 characters, we can take the substring from the 19th character onwards. Here is my expression, @{Name="DOC_REVISION"; Expression={$_.Name.Substring(0, $_.Name.IndexOf(' ')).Substring(19)}}, and the following is the output;

[screenshot: output with both DOC_NUMBER and DOC_REVISION columns]

Now I am almost done; all I need is to export this to CSV. To do that, we pipe the output to another cmdlet named Export-Csv, passing the file name with its path. Here is my final PowerShell command and its output;

Get-ChildItem -Attributes !Directory -Recurse |
    Select-Object Name, FullName,
        @{Name="DOC_NUMBER";   Expression={$_.Name.Substring(0, $_.Name.Substring(0, $_.Name.IndexOf(' ')).LastIndexOf('-'))}},
        @{Name="DOC_REVISION"; Expression={$_.Name.Substring(0, $_.Name.IndexOf(' ')).Substring(19)}} |
    Export-Csv c:\logs\exp.csv

[screenshot: the resulting CSV opened in Excel]

That's it, fellas. I hope you learned something today. What I realized is that PowerShell is really powerful. It is so well integrated into the system that we can do virtually anything with it.