Dataverse in China: internationalization, curation and promotion, by Yin Shenqin

Dataverse in China: internationalization, curation and promotion. Yin Shenqin, Fudan University Social Science Data Research Center, Shanghai, China [email protected]

Upload: datascienceiqss

Post on 12-Aug-2015


TRANSCRIPT

Dataverse in China: internationalization, curation and promotion

1

Yin Shenqin
Fudan University Social Science Data Research Center
Shanghai, China
[email protected]

Contents

Introduction

Internationalization

Data Curation

Promotion

2

Fudan social science data platform project

Started in 2012

3

Presenter
Presentation Notes
Fudan University started the social science data platform project in 2012.

Project development plan

4

Presenter
Presentation Notes
The development plan proceeds in five phases. In 2012, we surveyed 35 social science data centers around the world through literature review, website research, and on-site investigation. In 2013, we deployed four social science data platforms; after a six-month test, we chose Dataverse as the platform to localize and extend through secondary development. The Fudan University social science data platform entered its pilot run in June 2014.

Public launch

Dec 29, 2014: official release of the Fudan Dataverse repository
18 media outlets attended the ceremony and reported the news.

5

Data collections

46,000 files
1,319 researchers
5,796 projects
Survey and census data
Text files, data files (e.g. dta, spss, xls, csv), image files, and GIS data

6

Presenter
Presentation Notes
The Fudan Dataverse repository currently holds research findings, working papers, journal papers, and social science data. Most of the current datasets are survey and census data deposited by Fudan University affiliated researchers working in demography, economics, social science, geography, and related fields. The most common file types are text files, data files (e.g. dta, spss, xls, csv), image files (e.g. jpg), and GIS data.

Larger collections

Nationwide Population Census
Fudan Yangtze River Delta Social Transformation Survey (FYRST)
Population, Consumption and Carbon Emission
Fudan energy

7

Presenter
Presentation Notes
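So far the platform has integrated 54 datasets, 661 projects, and 1,041 files in total, including the Yangtze River Delta Social Transformation Survey, the Hangzhou metropolitan economic circle data, and the energy flow and carbon emission factor database. By the end of June 2014, records for 1,319 faculty members, 5,153 humanities and social science projects, and 45,835 published papers, provided by the university's humanities and social sciences research office, had been imported into the platform and accounts had been created. Faculty can be notified by email or other means to go online, confirm their records, and activate their accounts.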

Contents

Introduction

Internationalization

Data Curation

Promotion

8

Localization, secondary development, and internationalization

In 2013, built a Chinese version with secondary development based on DVN 3.3:
Chinese search engine with Chinese word segmentation
Navigation with a Chinese character index
Online analysis with Chinese support
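
Chinese text has no spaces between words, so a search engine needs a segmentation step before it can index terms. The slides do not say which segmenter the Fudan team used; the following is only a minimal sketch of one classic technique, greedy forward maximum matching, with a hypothetical vocabulary and function name.

```python
def segment(text, vocabulary, max_len=4):
    """Split text into words by greedy forward maximum matching.

    Illustrative only: the vocabulary and max_len are assumptions,
    not details of the Fudan platform.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a vocabulary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical vocabulary, for illustration only.
VOCAB = {"社会", "科学", "社会科学", "数据", "平台"}
tokens = segment("社会科学数据平台", VOCAB)
```

Each token can then be indexed separately, so a query for "数据" matches the title above even though the source text contains no word boundaries.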

9

Presenter
Presentation Notes
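In 2013: Chinese localization and secondary development of DVN 3.3, including online analysis. In collaboration with Harvard: internationalization and Chinese localization of DVN 4.0. Since the first half of 2014, the Fudan University social science data platform team has stayed in close contact with Harvard and taken part in the internationalization and Chinese localization of the new DVN version organized by Harvard University, keeping this work largely in step with updates to Harvard's source code. The workflow, as described by the main project's team: create a fork of the project on GitHub. The Fudan team then makes its changes, commits and pushes, and opens a pull request against the main branch. The main project's team then evaluates the change and accepts it. The one key thing is for the Fudan team to pull from the main project into the fork once a day, so that when they do make the pull request there are no merge issues.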

Files internationalized

In 2014-2015, collaborated with Harvard IQSS to internationalize DVN 4.0

39 front-end files

41 back-end files

10

Presenter
Presentation Notes
There are 21 front-end files and 295 back-end files in total. Each file had to be checked for text that needs localization, and the results were finally consolidated into 14 internationalization files containing 538 phrases and 2,289 words. Internationalization is done by modifying the original files: the extracted strings go into a single configuration file, and a Chinese localization configuration file is added alongside it.

Internationalization process

Files to be internationalized come mainly from three places:
XHTML files on the front end
Java files on the back end
JSF's built-in validation messages

11

Presenter
Presentation Notes
The files that need internationalization come mainly from three places: front-end XHTML files, back-end Java files, and JSF's built-in validation messages.
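
The general idea behind this kind of internationalization is that user-facing strings are replaced by keys, and each key is resolved against a locale-specific bundle with a fallback to the base language. The sketch below illustrates that lookup in Python. It is not the Dataverse implementation: the keys, strings, and function names are hypothetical, and the parser handles only plain `key=value` lines, not the full Java `.properties` syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


# Hypothetical base (English) bundle.
BASE = parse_properties("""
# base bundle
dataset.title=Title
dataset.upload=Upload files
""")

# Hypothetical Chinese bundle; one key is not translated yet.
ZH = parse_properties("""
# Chinese bundle
dataset.title=标题
""")


def message(key, locale_bundle, base_bundle):
    """Look up a key in the locale bundle, then the base bundle, then echo the key."""
    return locale_bundle.get(key, base_bundle.get(key, key))
```

With these bundles, `message("dataset.title", ZH, BASE)` resolves to the Chinese string, while an untranslated key such as `dataset.upload` falls back to English, so the interface stays usable while translation is in progress.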

Tools

LingoHub: translating from English into Chinese

12

Intl files *.properties

•Draft •Translated •Reviewed

Presenter
Presentation Notes
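Harvard suggested putting the internationalization files on LingoHub to ease translation into Chinese and other languages. LingoHub is a young translation tool that still had some bugs and awkward settings, which made it inconvenient to use; after several rounds of communication, LingoHub made changes in an upgrade.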
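
The three statuses shown on the slide (Draft, Translated, Reviewed) suggest a simple way to track translation progress per key. The sketch below is not LingoHub's API or data model; it only assumes that each key carries one of those three statuses, and the keys shown are hypothetical.

```python
from collections import Counter

# Stage names come from the slide; treating them as an ordered
# pipeline is an assumption.
STAGES = ("Draft", "Translated", "Reviewed")


def summarize(statuses):
    """Count keys per stage and list the keys that are not yet Reviewed."""
    counts = Counter(statuses.values())
    per_stage = {stage: counts.get(stage, 0) for stage in STAGES}
    pending = sorted(key for key, stage in statuses.items() if stage != "Reviewed")
    return per_stage, pending


# Hypothetical keys, for illustration only.
statuses = {
    "dataset.title": "Reviewed",
    "dataset.upload": "Translated",
    "login.error": "Draft",
}
per_stage, pending = summarize(statuses)
```

A summary like this makes it easy to see how much of a bundle still needs a translator or a reviewer before a release.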

Contents

Introduction

Internationalization

Data Curation

Promotion

13

Data Curation

Help researchers with:
Documenting and providing context for data
Organizing and formatting data
Storing and transferring data
Rights to use data

14

Presenter
Presentation Notes
We guide faculty and researchers in using Dataverse, collaborate with them to build dataverses for individual researchers or research teams, and build special data collections. Following the standards and rules of data curation, we help researchers process and upload their data. We supplement data specifications and notes, complete the descriptive information in detail, convert data formats, and protect data privacy, working at the study, file, and data levels as appropriate.

For example

15

Before: data stored on the researcher's local disks

Presenter
Presentation Notes
Dr. Lu Weidong, Center for Historical Geographical Studies, Fudan University. We processed 19 folders, built 27 studies, and uploaded 127 files. (To do: make a before/after comparison.)

For example

16

After: preserved in the Fudan Dataverse repository

Presenter
Presentation Notes
Data processing, format conversion, data specification, data description, data navigation, data ingestion, and so on.

Data services

Data processing, format conversion, detailed data description, supplementary metadata, and identifying data sources using the Handle System.

17

Presenter
Presentation Notes
We provide a series of data services, including data processing, format conversion, detailed data description, supplementary metadata, and identifying data sources using the Handle System. Without descriptive metadata and a mechanism to maintain it, a data lake could turn into a data swamp.

Contents

Introduction

Internationalization

Data Curation

Promotion

18

Social media

Promo videos
Weibo (like Twitter)
WeChat (the biggest mobile social media platform in China)
WeChat community
QQ community

19

Presenter
Presentation Notes
Release articles every Monday and Wednesday, all about data.

broadcasting

Series: "I speak for Fudan University Dataverse"

Mr. Pan Kexi (spokesperson)

Ms. Li Yun (micro media)

Ms. Li Yun (WeChat)

Mr. Lu Weidong (WeChat)

Presenter
Presentation Notes
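Produced two micro-media promotional videos in the "I speak for the social science data platform" series, featuring faculty member Pan Kexi and graduate student Li Yun. Promoted the platform through the Fudan University homepage and the Fudan University Library's Weibo and WeChat accounts, among other channels. Produced posters and tri-fold brochures for distribution.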

Posters

21

Actions

Alliance of 9 top universities on research data management (RDM)

2 Chinese domestic seminars

Reports on Dataverse at 7 nationwide academic conferences between 2013 and 2015

22

Nine universities alliance

China Academic Library Research Data Management Implementation Group

23

Presenter
Presentation Notes
Nine universities (Peking University, Tsinghua University, Zhejiang University, Wuhan University, Beijing Institute of Technology, Shanghai Jiaotong University, Shanghai International Studies University, Tongji University, and Fudan University) jointly initiated the establishment of the “China Academic Library Research Data Management Implementation Group”, in order to promote the development of Chinese domestic Research Data

Two seminars

In October 2014:
A seminar on Scientific Research and Practice of China Academic Library
A seminar on Shanghai academic scientific platform and institutional repository

24

Presenter
Presentation Notes
In October 2014, two seminars were held at Fudan University: one on Scientific Research and Practice of China Academic Library, and one on Shanghai academic scientific platform and institutional repository. More than 50 deputy directors and technologists from twenty academic libraries took part in the seminars.

Contact

Yin Shenqin, Fudan University Social Science Data Research Center, [email protected], @JasmineKanjur

25