Institutions can use the batch management tool to migrate repository items into their Figshare repository. Items can be linked to user accounts and uploaded to a specific group. Mapping and formatting metadata from an old system to the Figshare system is usually the most time-consuming part. Because every institution will have unique migration needs, this help page offers general guidance on the process rather than a fixed procedure.
IMPORTANT NOTE: Please test the metadata that you intend to publish using batch management thoroughly on your stage environment, so that you do not need to make changes after initial publication. Before publishing and/or updating items using batch management, make sure you know which metadata changes cause item versioning. Figshare does not support removal of versions. While admin users can unpublish individual items (in order to update and republish without creating a new version), removal of versions in batch is not supported. If you unexpectedly create multiple versions via batch management, our Support Team will charge for help removing versions in batch, and the work cannot be treated as urgent.
At this point you should have your repository set up with the groups and custom metadata fields you will need for the migration. We advise using your stage instance to create dummy draft items in each group that you will be migrating into. In these dummy items, fill in any custom metadata fields, add categories, keywords, funding, and create some examples of embargo files. You can use the batch management tool to download the metadata from these dummy items to use as a template for uploading.
Three other pieces of advice:
You may want to use separate CSV files to upload items by group rather than migrating items to different groups using one CSV file. This will make it easier to make sure the metadata in the CSV is formatted correctly and that the custom fields are filled out properly.
You can get a list of your group ids from this endpoint: https://docs.figshare.com/#private_institution_groups_list (Make sure you create an API token from a top-level administrator account and paste it in the top left field on that API documentation page.)
Put the group id in the group_id column in the CSV and items will automatically be associated with that group.
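The one-CSV-per-group advice above can be sketched in Python. This is only an illustration: the "title" column and the group ids 101 and 202 are placeholders, and the real column headers should come from the template you download with the batch management tool.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_group(rows):
    """Group row dictionaries by the value in their group_id column."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row["group_id"]].append(row)
    return dict(by_group)


def rows_to_csv_text(rows, fieldnames):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows; real rows would carry every column from your template.
rows = [
    {"title": "Thesis A", "group_id": 101},
    {"title": "Dataset B", "group_id": 202},
    {"title": "Thesis C", "group_id": 101},
]

per_group = split_rows_by_group(rows)
for group_id, group_rows in per_group.items():
    print(f"--- group {group_id} ---")
    print(rows_to_csv_text(group_rows, ["title", "group_id"]), end="")
```

Each block of CSV text could then be saved as its own file and uploaded separately, which keeps the custom fields of each group in their own sheet.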
TL;DR: use account_id to put records in an author’s repository account so they can edit the record. Use user_id to affiliate a specific author with a record whether they own the record or not.
In Figshare, there can be two ids associated with a researcher. If a researcher has an account in your repository (whether created by SSO, HRFeed, or manually) they will have an account id. If a researcher is listed as an author on a repository item, whether they have an account or not, they will have a user id. You can use the account id to give a researcher edit access to migrated items. A researcher with an account will also have a profile. Please note that the profile page uses the person’s user id, since that is how they are associated with records, whether they ‘own’ the record or not. You can use the user id when uploading metadata to make sure items show up in researcher profiles. This also enables better reporting because it reduces duplicated author names across repository items.
If you want researchers to have edit access to an item, you need to put their account id in the account_id column in the CSV. If you do not add an account_id, the item will be uploaded to the administrator account that is running the batch upload.
You will need to have the researcher accounts created before migration in order to get the account_id (to provide edit access) and the user_id (to link them as an author if needed). The best way to do this is to create the accounts manually through the API and add the SSO id in the “institution_user_id” field. You can also create accounts through an HRFeed or, once your repository is launched, you can ask researchers to log in through SSO, which will automatically create their account. You can then retrieve the account_id and user_id as needed.
You can see the account_ids either in the User Report or from this API endpoint: https://docs.figshare.com/#private_institution_accounts_list.
You could upload author names as ‘first name’ and ‘last name’ for each item but this is not recommended for existing authors. Each first/last name combination will receive its own database entry - lots of duplicates! - and will make reporting more difficult than it needs to be.
To connect an item with an existing author account, simply add the author to the CSV item using the user_id. You can see the user_ids either in the User Report or from this API endpoint: https://docs.figshare.com/#private_institution_accounts_list. Add the authors in the ‘authors’ column in this format:
[{"id": 1438453}, {"id": 1438451}, {"id": 701402}, {"id": 1438455}]
If you add any other data (like “first name”) after the id value, it will be ignored. Adding authors in this way will automatically connect the item to the author information stored in the database including ORCID and CRIS/RIM id. The item will show up in each author’s profile if they have an account in any Figshare powered repository.
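As a rough sketch, the ‘authors’ cell can be generated from a list of user ids instead of being typed by hand. The helper name authors_cell is made up for this example. Writing the row with Python's csv module also takes care of escaping the double quotes inside the JSON.

```python
import csv
import io
import json


def authors_cell(user_ids):
    """Build the JSON text for the 'authors' column from Figshare user ids."""
    return json.dumps([{"id": int(user_id)} for user_id in user_ids])


cell = authors_cell([1438453, 1438451])
print(cell)

# The csv module quotes the field and doubles the inner double quotes,
# so the JSON survives a round trip through the CSV file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["authors"], lineterminator="\n")
writer.writeheader()
writer.writerow({"authors": cell})
csv_text = buffer.getvalue()
print(csv_text, end="")
```

Passing the ids through int() means an id read as text from a spreadsheet export (for example "1438453") still ends up as a JSON number.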
Important Notes:
If your repository will be integrated with a CRIS/RIM system like Symplectic Elements, you will need to add authors using the user_id so that the item can be harvested into the CRIS/RIM system properly.
If you want an item in your repository to show up in a user account in another Figshare repository, like in figshare.com or at another university, you will need to get that author’s user id from their profile page. This person’s user id is the number at the end of the profile URL: https://figshare.com/authors/_/473204.
You can add funding information as free text by including the grant name in the funding column using this format:
[{"title": "My grant 1"},{"title": "My grant 2"}]
You can also link grants from large funders to the grant item in the Dimensions database. At this time, this is a two-step process. First, find the id for the grant in the Figshare system using this API endpoint: https://docs.figshare.com/#private_funding_search (you’ll need to add your API token to the field in the top left). Then, if you find the grant, add the id to the funding column like this:
[{"id": 9621728},{"id": 3058082}]
The two grant items for those ids are pictured below.
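Both funding formats can be produced with small helpers, sketched here under the assumption that each cell uses one format or the other, as in the examples above. The function names are invented for this illustration. The output has a space after each comma, which makes no difference to the JSON.

```python
import json


def funding_titles_cell(grant_titles):
    """Build the funding cell for free-text grant names."""
    return json.dumps([{"title": title} for title in grant_titles])


def funding_ids_cell(grant_ids):
    """Build the funding cell for grant ids found via the funding search."""
    return json.dumps([{"id": int(grant_id)} for grant_id in grant_ids])


free_text = funding_titles_cell(["My grant 1", "My grant 2"])
linked = funding_ids_cell([9621728, 3058082])
print(free_text)
print(linked)
```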
Links to related materials, like a published paper, dataset, or different version of a paper, are added in the related_materials column. As with authors and funding, the content needs to be formatted as JSON. This is an example:
[{"identifier": "10.1038/s41550-020-1208-y", "title": "The ecological impact of high-performance computing in astrophysics", "identifier_type": "DOI", "relation": "IsSupplementTo", "is_linkout": 1}]
The ‘identifier_type’ field is sourced from DataCite’s RelatedIdentifier list. The ‘relation’ field is sourced from DataCite’s list of relation types. The ‘is_linkout’ field determines if the linked title shows up in a call-out box on the record page and can take the value 0 (zero) or 1 (one). Up to five links can be shown as call-out boxes.
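A minimal sketch of building the related_materials cell with two local sanity checks: that is_linkout is 0 or 1, and how many links ask for a call-out box. The helper names are hypothetical, and requiring is_linkout on every link is a choice made for this sketch, not a documented Figshare rule.

```python
import json

MAX_CALLOUTS = 5  # from the text above: up to five links can be shown as call-out boxes


def related_materials_cell(links):
    """Check is_linkout on every link, then return the JSON text for the column."""
    for link in links:
        value = link.get("is_linkout")
        if isinstance(value, bool) or value not in (0, 1):
            raise ValueError(f"is_linkout must be 0 or 1, got {value!r}")
    return json.dumps(links)


def callout_count(links):
    """Count how many links request a call-out box."""
    return sum(1 for link in links if link.get("is_linkout") == 1)


links = [
    {
        "identifier": "10.1038/s41550-020-1208-y",
        "title": "The ecological impact of high-performance computing in astrophysics",
        "identifier_type": "DOI",
        "relation": "IsSupplementTo",
        "is_linkout": 1,
    }
]

cell = related_materials_cell(links)
print(cell)
if callout_count(links) > MAX_CALLOUTS:
    print("More than five links request a call-out box")
```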
Files need to be available to the Figshare system whether from a web server, another service like Dropbox, or an FTP location. Add a column called ‘files’ to the CSV and add file URLs like this:
["https://journals.aqs.org/pdf/10.1103","ftp://mirror.easyname.at/ubuntu-releases/robots.txt"]
This would upload two files, each from a different location, into one item. Notice that the second file is coming from an FTP server. Ideally, the files are already publicly available through your legacy repository and you can use the URLs from there.
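The ‘files’ cell can be generated in the same way. The sketch below also rejects values that are not full URLs before the upload is attempted. The set of accepted schemes is an assumption based on the examples above, not a list published by Figshare, and files_cell is a made-up helper name.

```python
import json
from urllib.parse import urlsplit

# Assumption: only these schemes are accepted by this local check.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def files_cell(urls):
    """Check that each entry looks like a full URL, then build the JSON list."""
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
            raise ValueError(f"not a full file URL: {url!r}")
    return json.dumps(list(urls))


cell = files_cell([
    "https://journals.aqs.org/pdf/10.1103",
    "ftp://mirror.easyname.at/ubuntu-releases/robots.txt",
])
print(cell)
```

This only checks the shape of each URL. It does not confirm that the file is actually reachable by the Figshare system.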
You may want to upload items that already have a DOI or a Handle. Add the DOI or Handle in the appropriate column without the URL information (e.g. 10.1636/P10-15.1).
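Legacy exports often store identifiers as full URLs. As an illustration, a helper like the one below strips common resolver prefixes so that only the bare identifier remains. The function name and the list of prefixes are choices made for this sketch, and the Handle value in the usage lines is a made-up placeholder.

```python
RESOLVER_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "https://hdl.handle.net/",
    "http://hdl.handle.net/",
)


def bare_identifier(value):
    """Return a DOI or Handle without any leading resolver URL."""
    stripped = value.strip()
    lowered = stripped.lower()
    for prefix in RESOLVER_PREFIXES:
        if lowered.startswith(prefix):
            return stripped[len(prefix):]
    return stripped


print(bare_identifier("https://doi.org/10.1636/P10-15.1"))
print(bare_identifier("http://hdl.handle.net/12345/678"))
```

A value that is already a bare identifier passes through unchanged apart from trimmed whitespace.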
Every institution will have unique migration needs. Please use this as just an example. This workflow assumes the records will be migrated into one administrator account for editing but the items are linked to existing researcher user_ids.
Can’t find your answer here? Check the community discussion or raise a support ticket.