There are a number of List transformations available in Power Query’s graphical interface. However, the number of functions in the graphical interface is very limited. In this post, I’m going to explain about a function that is really powerful and is not yet listed in the graphical interface. List.Accumulate is a function that loops through items of a list and applies a transformation. This function at the time of writing this post is only available through Power Query M scripting. If you like to learn more about Power BI read Power BI book from Rookie to Rock Star.
List Transformations in Graphical Interface of Power Query
If you have a list, you can see what transformations are available to be applied to it using the graphical interface. you can create a list with a simple M script as below;
generate a list
This code generates a list of numbers from 1 to 12. Now you can see in the top menu under List Tools, that there are a number of transformations available for List. These items are very few options such as; convert to a table, keep items, remove items, and etc. Altogether considering all options in every element this list is not more than 20 functions. However, let’s look at the number of List functions in M script.
List Functions in M; Power Query Formula Language
I have written previously about the usage of the #shared key to finding all functions available in M. you can then filter it to only “List.” Functions and then you will end up with a bit list of functions; 69 functions! more than 3 times of what you see in the graphical interface.
Some of these functions Some of the functions in this list are very useful and powerful. Example of those functions is List.Dates, List.Numbers, List.Generate, List.Accumulate and many others. Obviously, we cannot go through all functions in one blog post. In this post, I’ll be covering List.Accumulate and in future posts, I’ll talk about other functions.
List.Accumulate Function
List.Accumulate is a function that can easily save a number of steps in your Power Query transformations, instead of applying multiple steps, you can simply use List.Accumulate to overcome what you want. List.Accumulate function loops through the list and accumulate a value as a result. This function needs usually three parameters; the list itself, seed, and accumulator. Here are parameters explained in details;
- list; the list that we want to apply the transformation to it.
- seed; is the initial value.
- accumulator; is a function. this function determines what accumulation calculation happens on items of the list. the way that this function is defined is exactly the way that you write a function in Power Query M script using Lambda expressions.
best way to learn about seed and accumulator is through some examples, let’s apply some transformations with List.Accumulate and see how these two parameters are working.
Accumulate to Calculate Sum
The sum is a function that basically is accumulating every two values in the list till the end of the list. if you want to write Sum with List.Accumulate, you can do it with this expression:
Accumulate as Sum
state is the value accumulated in the calculation. current is the current item in the list. seed is the initial value of the state
Let’s see how the calculation works. To clarify it more in details, I explained the value of state, current, and accumulator in every step;
List.Accumulate loops through every item in the list and run accumulator function. the very first time, the state value is equal to seed. which in this case is zero. current is the current value in the list. For the very first item that value is 1. so accumulator result is state+current=0+1=1. this value then will be the state of the next item on the list. for the next item, the state is 1 (calculated from the previous row), and current is 2. state+current becomes 1+2=3. this process continues through the whole list, so the final state value for a list from 1 to 20, becomes 210, which is equal to the sum of those values. In every row of the list, we added that to the previous row’s result.
Accumulate to Calculate Max
learning how the accumulate function can cover basic tasks, help you to understand how accumulator function works. for applying Max, you need to compare every two items in the list and pick the one which is bigger. Here is the script;
accumulate as max
2 4 6 | (state,current)=> thenstate+1 ) |
the logic is simple, if the item matches with “a” (which in this case is our token), then count it, otherwise don’t
Accumulate as Count Token Partial Match
Similar to the previous calculation, however this time we want to count the item, even it partially matches the text. this can be done with the help of Text.Contains function. and because of Text.Contains might not find the lower case or upper case matches, we convert it to lower case beforehand.
accumulate as count token partial match
2 4 6 | (state,current)=> thenstate+1 ) |
Accumulate as Conditions on Records
So far, we went through a lot of use cases and examples of List.Accumulate function. You understand that this function can be so powerful and useful in many scenarios. However, the main use cases of List.Accumulate is to apply to scenarios which other functions cannot resolve easily. List.Sum might be better used than List.Accumulate which only calculates the sum. However, there are many scenarios that a number of steps are needed to get the result you want with normal functions. In those cases, List.Accumulate is your friend.
For example; consider the situation that you have a list, and this list is a list of records! every record might have a different set of fields, you want to fetch only records that have a specific field on their list, and get their position as a concatenated result. This process, using other list functions might take a number of steps, however, with List.Accumulate that is easy.
Here is the sample input list;
As you can see the list includes records, and to find out what each record has, you need to expand it. a list of records cannot be expanded because it will add the number of columns and list can have only one column. so you have to convert it to a table, and then expand it. As you feel now; there are a number of steps required to get the result we wanted. However, List.Accumulate can be a big help in this scenario. Here is the calculation with List.Accumulate;
Accumulate for conditions on records
2 4 6 | (state,current)=> thenstate&','&Text.From(List.PositionOf(Source,current)) ) |
Record.HasFields used to determine if a record contains a field (“A” in this example).
List.PositionOf used to get the position of that record which satisfy the criteria above.
Summary
This post explained how List.Accumulate works. List.Accumulate is a very powerful function, that can easily save a number of steps in your Power Query transformations. In this post, you’ve learned basics of this function with using it for simple operations such as Sum, Divide, Product, Max, Count and etc. You also learned that the main power of this function is when basic functions cannot operate easily. You learned that the accumulator function gives you full power to write exactly what you want. In future posts, I’ll write about other List functions that can be very powerful, but you still don’t have it in the graphical interface.
Have you liked the List.Accumulate function as I do? if yes, please share your story that how this function can be helpful for you in the comments below.
Related posts:
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for eight continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
Last week I was teaching a Power BI Workshop. On the final day while attendees were building their own solutions, one participant named Linda says to me: 'Melissa, Power Pivot is missing some data.' We proceeded to trace back through her Power Query operations to determine where the data got eliminated.
She shows me the final Power Query and points to the column header and says 'See, there's no filters here. So why didn't it end up in Power Pivot?' Since this was a very interesting observation from a very bright gal, I thought I'd share it.
To illustrate, first let's set up a quick little list of Dept Numbers and Names in the first step of our Query Editor:
In the second step, let's add a filter to remove one Dept Number.
In our third step after the filter, let's do something. Anything. For grins I merged two columns to produce a concatenated column of Dept Name - Number. Really though, this third step only exists to prove a point that there is no longer an indication in the column header that a filter occurred on this column in a previous step:
I often say that each step in the Query Editor displays the state of the data at that point in time. Since the filter occurred on a previous step, the rows filtered out are now really gone in the next step - the state of the data changed. This is why there's no longer an indication a filter occurred in a previous step. Power Query behavior isn't like Excel where the rows are still there just hidden; rather, the rows are truly filtered out and won't progress further along in the steps or into Power Pivot.
So, having said all that, the two main takeaways are:
- Filters in Power Query aren't controlling what you see on the screen; they truly exclude the data from progressing any farther.
- There's not a visual indication in Power Query on steps after a filter operation has occurred, so keep an eye out for that if you are doing some data exploration as you build your steps.
You Might Also Like..
By: Koen Verbeeck | Last Updated: 2015-05-15 | Comments (6) | Related Tips: More >Microsoft Excel Integration
Problem
The tip Introduction to Power Query for Excel introduced us to the world of Power Query, a self-service ETL tool of the Power BI family. With this tool, a lot of different sources can be extracted into Excel with just a few clicks. This tip will show you how you can read a SharePoint list with Power Query and how you can extract its contents into Excel.
Solution
If you haven't installed Power Query yet, you can download the free Excel add-in. For a general introduction, please read the tip Introduction to Power Query for Excel. In this tip, I'm going to extract data from a SharePoint list located locally in a virtual machine. It contains a few Olympic disciplines and the sport they belong to.
Reading the SharePoint List
To read data from a SharePoint list, go to the Power Query ribbon in Excel, select From Other Sources and choose From SharePoint List.
The first step is to enter the URL to the SharePoint site. This doesn't have to be the exact URL to the list, the site itself is enough.
The second step is authentication. Depending on your set-up, this step behaves differently. If you can sign in into the SharePoint site using your current Windows credentials, this step might be skipped automatically. In my case, the SharePoint server is in another Active Directory, so I need to confirm how Power Query needs to access the SharePoint site. However, Anonymous and Organizational account are not supported, although these options are listed.
At the time of writing, the Power Query user interface currently has no option to specify in a different Windows account, but this might change in the future. When you hit Save, Power Query tries to connect with the current Windows account, which gives the following error:
A work-around is to first go to the SharePoint list with Internet Explorer. The browser will prompt you for credentials.
When you select Remember my credentials, the Windows user and password are stored locally on your computer and Power Query can use those to log into the SharePoint site. Another option is to save your set of credentials directly into the Credential Manager in the control panel.
After Power Query successfully logged into the SharePoint site, the navigator will display its contents.
To view the SharePoint List contents, simply select the list and click Edit. This will bring you to the Power Query editor where you can use the full power of Power Query to manipulate the data. An interesting option is to include more metadata from the SharePoint list. There is for example a column CreatedBy, which is actually a link to a record that contains more information of the person who created the SharePoint list. When you click the double arrows, you get a list of columns that you can include in your original SharePoint list, such as the name of the creator.
There is however a lot of metadata columns in a SharePoint list that are not immediately useful, so it makes sense to keep only the columns that you need. The easiest option to hide a lot of the columns and select only the columns that you need, right-click and then choose Remove Other Columns. This option makes sure that if new columns are added to the SharePoint list, they will not be included when the query is refreshed.
If you have to select a lot of columns and they are hard to track down, a good alternative is to hit Choose Columns in the Home ribbon. This will give a pop-up where you can easily select the necessary columns using checkboxes.
All that is left is to save the changes in the editor menu and to load the data into Excel.
Working with a non-English SharePoint List using OData
When your SharePoint website has a different locale than English, it's possible Power Query has issues reading the metadata, where you end up with an empty navigator. The problem is described here: Unable to import Non-English SharePoint Lists. Hopefully this gets fixed in a future release of Power Query, but for the moment we have to use a work-around: reading the SharePoint list using OData. You can find the From OData Feed import right beneath the SharePoint List in the From Other Sources dropdown in the Power Query ribbon.
The first step is to enter the URL to the OData feed. This is your SharePoint site with the following path attached: /_vti_bin/ListData.svc.
The second step is again specifying the credentials. In this example the Windows credentials were used again.
When Power Query successfully reads the OData feed, the resulting metadata is displayed in the Navigator. This time there are a lot more lists to read from, but your SharePoint list should be present as well. By selecting it and choosing Edit, Power Query is opened and there is no difference as when we were reading the SharePoint list directly.
Conclusion
In this tip we showed how easily Power Query can read a SharePoint list. You can authenticate using Windows credentials. If the SharePoint site is a non-English site, you can read the list using OData instead.
Next Steps
- For an introduction to Power Query, see Introduction to Power Query for Excel.
- In the tip Using the New OData Source in SQL Server Integration Services, you can read how to read OData feeds with Integration Services.
- If you want to know more about Power Query and the M language, check out the book Power Query for Power BI and Excel by SQL Server MVP Chris Webb. Since Power Query is frequently updated, some parts may be already a bit dated. Most likely the user interface has changed the most, but most of the M language is still unchanged.
Last Updated: 2015-05-15
About the author
Koen Verbeeck is a BI professional, specializing in the Microsoft BI stack with a particular love for SSIS.
View all my tips
View all my tips
As I’ve explained many times before, querying SharePoint data directly is a bad idea. The SharePoint data storage mechanisms simply aren’t designed for querying of any scale, hence the lookup limitations that have been imposed upon it. The best approach to querying SharePoint list data is to first load it into a data warehouse or data mart of some sort. However, both Reporting Services (SSRS) and Power Query support direct access to SharePoint lists. While I try to strongly dissuade people from doing this with Reporting Services, properly used, Power Query is a totally viable means of querying SharePoint list data.
Why is this? With SSRS, every query goes back to the data source for retrieval. Power Query is different – it’s analogous to SQL Server Integration Services, which is an ETL management product. It loads source data into a repository, in this case, an embedded xVelocity, or Power Pivot model which can be considered a “personal data warehouse”. Queries against this mini data warehouse are fast, and don’t rely on SharePoint retrieval mechanisms, and can be used quite effectively in reports.
There are a couple of subtleties to querying SharePoint list items with Power Query, and I will briefly walk through the process below.
With Excel open, click the Power Query tab, select “From Other Sources” and the select “From SharePoint List”.
Next, enter the URL for the SharePoint site (or subsite) that contains the list you wish to query.
If it is the first time accessing this site, you will be prompted for credentials. If your site is Office365, be sure to enter organizational credentials. If it is on premise, use Windows credentials.
Once entered, you will be presented with a list of SharePoint lists in the Power Query Navigator window. Select the list that you wish to query, in our case, Announcements. When selected, click the edit button to edit the query.
The data, or a subset will load into the query editor window. You will see all of the list item fields expressed as columns, and for the most part, using the correct data type. At this point you can remove any columns that are unnecessary, or filter any undesired rows. There are a couple of SharePoint field types that bear special mention.
Lookup fields are a lookup into another SharePoint list. Internally, the SharePoint item stores this as an ID and display value, but Power Query gives you access to all of the properties of the related item as a one-to-one relationship. Essentially, what you can do is to flatten that relationship by incorporating the related item’s attributes.
If you scroll to a column of this particular type, you will see the value expressed as a hyperlink with the value “Record”. Clicking on it will drill down to one related record, but that’s not what we want to do. We want to expand the properties for all items in the list. The way that you do this is to click on the expand icon in the column header. In our case, we want to expand the “CreatedBy” field. CreatedBy is a standard list field, of the Person type. Person fields are actually a special case of a lookup field, so it exhibits this behaviour.
Here, we are interested in retrieving the user’s name and mobile phone, so we deselect all of the other fields. A new column will be created for every expanded field in the format sourcefieldname.attributename .
Attachments are another special case. There can be multiple attachments for a single list item, a one to many relationship. The hyperlink is therefore “Table”. Clicking on the column header expand for this column looks similar, but with an important difference. Options are available to either expand or aggregate the related items.
Selecting expand will create a new source record for each related item, and the only columns that will differ will be the items selected from the related table (Name in our case). Aggregate will not create any new records, but will summarize the related fields. For numeric fields, they can be totalled or averaged, and for text fields they can be counted.
Once ready, click “Close and Load” from the Query Editor ribbon, and the list data will load to either your model, or your workbook, depending on what your preferences are. Of course, I always recommend that you load to the model only.
Once loaded, any visualizations and queries will work against the model. The data can be refreshed at any point either manually, or automatically if using the Data Management Gateway. Keep in mind however that refreshes will operate against the source list.
Advertisements