Python json.loads object_hook=OrderedDict not working

2018-12-18 python json

Question

I have the following json file:

{
  "glossary": {
    "title": "example glossary",
    "GlossaryID": "5302",
    "GlossDiv": {
      "title": "S",
      "GlossList": {
        "GlossEntry": {
          "ID": "SGML",
          "SortAs": "SGML",
          "GlossTerm": "Standard Generalized Markup Language",
          "Acronym": "SGML",
          "Abbrev": "ISO 8879:1986",
          "GlossDef": {
            "para": "A meta-markup language, used to create markup languages such as DocBook.",
            "GlossSeeAlso": [
              "GML",
              "XML"
            ]
          },
          "GlossSee": "markup"
        }
      }
    }
  }
}

I am reading this json file using following command:

data = json.loads(str,object_hook=OrderedDict)

But, it still doesn't maintain the order of insertion:

OrderedDict([
  (u'glossary',
  OrderedDict([
    (u'GlossDiv',
    OrderedDict([
      (u'GlossList',
      OrderedDict([
        (u'GlossEntry',
        OrderedDict([
          (u'GlossDef',
          OrderedDict([
            (u'GlossSeeAlso',
            [
              u'GML',
              u'XML'
            ]),
            (u'para',
            u'A meta-markup language, used to create markup languages such as DocBook.')
          ])),
          (u'GlossSee',
          u'markup'),
          (u'Acronym',
          u'SGML'),
          (u'GlossTerm',
          u'Standard Generalized Markup Language'),
          (u'Abbrev',
          u'ISO 8879:1986'),
          (u'SortAs',
          u'SGML'),
          (u'ID',
          u'SGML')
        ]))
      ])),
      (u'title',
      u'S')
    ])),
    (u'GlossaryID',
    u'5302'),
    (u'title',
    u'example glossary')
  ]))
])

I am looping through the items in the dictionary and listing out the root element and its elements. I want it in the same order as it is in the json file.

I am looking for structures and arrays in the json and each array or structure will be a different table for me. So I want the output as:

Glossary-
title:example glossary,
GlossaryID:5302

GlossDiv-
title:S

GlossEntry-
ID: SGML,
SortAs: SGML,
GlossTerm: Standard Generalized Markup Language,
Acronym: SGML,
Abbrev: ISO 8879:1986,
GlossSee: markup

and so on. But because it's not maintaining the order, I am getting:

glossary
GlossDiv
GlossList
GlossEntry
GlossDef
GlossSeeAlso
para : A meta-markup language, used to create markup languages such as DocBook.
GlossSee : markup
Acronym : SGML
GlossTerm : Standard Generalized Markup Language
Abbrev : ISO 8879:1986
SortAs : SGML
ID : SGML
title : S
GlossaryID : 5302
title : example glossary

Solution

When you use the object_hook parameter, the decoder will first reconstruct the mapping as a plain dict, then pass that dict to the given hook. This will lose the ordering of the items.
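To see the difference for yourself, here is a small sketch that records what each hook is handed. The names spy_object_hook, spy_pairs_hook and seen are made up for this illustration and are not part of the json module.

```python
import json

seen = []  # hypothetical log of what each hook received

def spy_object_hook(obj):
    # object_hook is handed an already-built plain dict
    seen.append(("object_hook", type(obj).__name__))
    return obj

def spy_pairs_hook(pairs):
    # object_pairs_hook is handed a list of (key, value) tuples in document order
    seen.append(("object_pairs_hook", type(pairs).__name__, [k for k, _ in pairs]))
    return dict(pairs)

text = '{"b": 1, "a": 2}'
json.loads(text, object_hook=spy_object_hook)
json.loads(text, object_pairs_hook=spy_pairs_hook)

for entry in seen:
    print(entry)
```

On interpreters where a plain dict does not preserve insertion order, the dict passed to object_hook has already lost the file order, so wrapping it in an OrderedDict afterwards cannot bring it back.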

Presumably you're using a version of Python before 3.7 (plain dicts preserve insertion order by default from 3.7 onward), and if you check the json module documentation for your version (e.g. 3.6), you'll find the answer in the object_pairs_hook parameter:

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict() will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.
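The last sentence of that quote is easy to check. This is only a quick sketch: plain_hook, pairs_hook and calls are hypothetical names used to log which hook actually runs when both are passed.

```python
import json
from collections import OrderedDict

calls = []  # hypothetical log of which hook was invoked

def plain_hook(d):
    calls.append("object_hook")
    return d

def pairs_hook(pairs):
    calls.append("object_pairs_hook")
    return OrderedDict(pairs)

result = json.loads(
    '{"z": 1, "a": 2}',
    object_hook=plain_hook,
    object_pairs_hook=pairs_hook,
)

print(calls)
print(list(result.keys()))
```

Only the pairs hook should show up in calls, and the keys should come back in the order they appear in the input string.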

Replace object_hook with object_pairs_hook, and this should do what you're looking for.
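As a minimal sketch of that change, using a shortened copy of the glossary document from the question (the variable name raw is my own, chosen to avoid shadowing the built-in str):

```python
import json
from collections import OrderedDict

# Shortened version of the JSON from the question
raw = """
{
  "glossary": {
    "title": "example glossary",
    "GlossaryID": "5302",
    "GlossDiv": {
      "title": "S"
    }
  }
}
"""

# object_pairs_hook receives the key/value pairs in file order,
# so every nested object becomes an OrderedDict with that order
data = json.loads(raw, object_pairs_hook=OrderedDict)

for key, value in data["glossary"].items():
    print(key, ":", value)
```

The keys under glossary should now iterate as title, GlossaryID, GlossDiv, matching the file.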
